Abstract

Matchmaking arises when supply and demand meet in an electronic marketplace, or 
when agents search for a web service to perform some task, or even when recruiting agencies 
match curricula and job profiles. In such open environments, the objective of a matchmaking
process is to discover the best available offers for a given request.

We address the problem of matchmaking from a knowledge representation perspective, 
with a formalization based on Description Logics. We devise Concept Abduction and Concept
Contraction as non-monotonic inferences in Description Logics suitable for modeling
matchmaking in a logical framework, and prove some related complexity results. We also
present reasonable algorithms for semantic matchmaking based on the devised inferences,
and prove that they obey some commonsense properties.

Finally, we report on the implementation of the proposed matchmaking framework, 
which has been used both as a mediator in e-marketplaces and for semantic web services 
discovery. 



1. Introduction 

The promise of the Semantic Web initiative is to revolutionize the way information is coded, 
stored, and searched on the Internet (Berners-Lee, Hendler, & Lassila, 2001). The basic 
idea is to structure information with the aid of markup languages, based on the XML 
language, such as RDF and RDFS 1 , and OWL 2 . These languages have been conceived 
for the representation of machine-understandable, and unambiguous, description of web 
content through the creation of domain ontologies, and aim at increasing openness and 
interoperability in the web environment. 

Widespread availability of resources and services enables, among other advantages,
interaction with a number of potential counterparts. The bottleneck is the difficulty
of finding matches, possibly the best ones, between parties.

The need for a matchmaking process arises when supply and demand have to meet in a 
marketplace, or when web services able to perform some task have to be discovered, but also 
when recruiting agencies match curricula and job profiles or a dating agency has to propose 
partners to a customer of the agency. Requests and offers may hence be generic demands 
and supplies, web services, information, tangible or intangible goods, and a matchmaking 
process should find for any request an appropriate response. In this paper we concentrate
on automated matchmaking, basically oriented to electronic marketplaces and service
discovery, although principles and algorithms are general enough to cover other
scenarios as well. We assume, as is reasonable, that both requests and offers are endowed with
some kind of description. Based on these descriptions, the target of the matching process
is to find, for a given request, the best matches available in the offers set and also, given an
offer, to determine the best matching requests in a peer-to-peer fashion. We may hence think of an
electronic mediator as the actor who actively tries to carry out the matchmaking process.
Obviously, descriptions might be provided using unstructured text, in which case such an
automated mediator would have to fall back on either basic string matching techniques or
more sophisticated Information Retrieval techniques.

1. http://www.w3.org/RDF/

2. http://www.w3.org/TR/owl-features/

The Semantic Web paradigm calls for descriptions that should be provided in a struc- 
tured form based on ontologies, and we will assume in what follows that requests and offers 
are given with reference to a common ontology. It should be noted that even when requests
and offers are described in heterogeneous languages, or using different ontologies modelling 
the same domain, schema/data integration techniques may be employed to make them 
comparable, as proposed e.g., by Madhavan, Bernstein, and Rahm (2001), and Shvaiko and 
Euzenat (2005); but once they are reformulated in a comparable way, one is still left with 
the basic matchmaking problems: given a request, are there compatible offers? If there are 
several compatible offers, which, and why, are the most promising ones? 

Matchmaking has been widely studied and several proposals have been made in the past;
we report on them in Section 2. Recently, there has been a growing effort aimed at the
formalization with Description Logics (DLs) (Baader, Calvanese, McGuinness, Nardi, &
Patel-Schneider, 2003) of the matchmaking process (e.g., Di Sciascio, Donini, Mongiello, &
Piscitelli, 2001; Trastour, Bartolini, & Priest, 2002; Sycara, Widoff, Klusch, & Lu, 2002; Di
Noia, Di Sciascio, Donini, & Mongiello, 2003b; Li & Horrocks, 2003; Di Noia, Di Sciascio,
Donini, & Mongiello, 2003c, 2003a, among others). DLs, in fact, allow one to model structured
descriptions of requests and offers as concepts, usually sharing a common ontology.
Furthermore, DLs allow for an open-world assumption: incomplete information is admitted,
and absence of information can be distinguished from negative information. We provide some
background on DLs in Section 3.

Usually, DL-based approaches exploit standard reasoning services of a DL system,
namely subsumption and (un)satisfiability, to match potential partners in an electronic
transaction. In brief, if a supply is described by a concept Sup and a demand by a concept Dem,
unsatisfiability of the conjunction of Sup and Dem (noted as $Sup \sqcap Dem$) identifies the
incompatible proposals, satisfiability identifies potential partners (that still have to agree on
underspecified constraints), and subsumption between Sup and Dem (noted as $Sup \sqsubseteq Dem$)
means that requirements on Dem are completely fulfilled by Sup.
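
As a minimal illustration of this three-way classification, the following Python sketch treats concepts as plain conjunctions of concept names, with disjointness as the only source of unsatisfiability. This toy encoding, the names `DISJOINT`, `satisfiable`, `subsumed_by`, and `classify`, and the sample vocabulary are all assumptions made for the example; a real system would delegate both tests to a DL reasoner.

```python
from itertools import combinations

# Hypothetical disjointness axioms for a toy vocabulary (an assumption of this sketch).
DISJOINT = {frozenset({"Intel", "AMD"}), frozenset({"CRTmonitor", "LCDmonitor"})}


def satisfiable(concept: frozenset) -> bool:
    """A conjunction of names is satisfiable unless it contains a disjoint pair."""
    return not any(frozenset(pair) in DISJOINT for pair in combinations(concept, 2))


def subsumed_by(c: frozenset, d: frozenset) -> bool:
    """c is subsumed by d if c is unsatisfiable or every name of d occurs in c."""
    return not satisfiable(c) or d <= c


def classify(sup: frozenset, dem: frozenset) -> str:
    """Classify a supply/demand pair with satisfiability and subsumption only."""
    if not satisfiable(sup | dem):
        return "incompatible"
    if subsumed_by(sup, dem):
        return "demand fully satisfied"
    return "potential partner"


supply = frozenset({"PC", "Intel", "LCDmonitor"})
print(classify(supply, frozenset({"PC", "Intel"})))
print(classify(supply, frozenset({"PC", "AMD"})))
print(classify(supply, frozenset({"PC", "Linux"})))
```

Note that the three outcomes only partition the pairs; they give no ordering inside the "potential partner" class, which is the limitation discussed next.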

Classification into compatible and incompatible matches can be useless in the presence of
several compatible supplies: some way to rank the most promising ones has to be identified, and
some explanation of the motivation for such a rank would also be useful. On the other hand,
when compatible matches are lacking, one may accept to turn to incompatible matches
that could still be interesting, by revising some of the original requirements presented in
the request, provided one can easily identify them.

In other words, some method is needed to provide a logic-based score for both compatible
and incompatible matches and eventually provide a partial/full ordering, allowing a user
or an automated agent to choose the most promising counteroffers. Furthermore, it should be
possible, given a score, to provide logical explanations of the resulting score, thus making it
possible to understand the ranking and easing further interaction to refine/revise the request.
Although this process is quite simple for a human being, it is not so in a logic-based fully
automated framework. We believe there is a need to define non-monotonic reasoning services
in a DL setting, to deal with approximation and ranking, and in this paper we propose
the use of Concept Abduction (Di Noia et al., 2003a) and Concept Contraction (Colucci,
Di Noia, Di Sciascio, Donini, & Mongiello, 2003), as services able to address the issues
highlighted above in a satisfactory way. Contributions of this paper include:

• a logical framework to express requests and offers in terms of concept descriptions, 
and properties that should hold in a matchmaking facilitator; 

• Concept Abduction as a logical basis for ranking compatible counteroffers to a given
offer and providing logical explanations of the ranking result;

• Concept Contraction as a logical basis for ranking incompatible matches, aimed at
discovering the most promising "near misses", and providing logical explanations of the
ranking result;

• algorithms implementing the formalized inferences for matchmaking purposes and 
complexity results for a class of matchmaking problems; 

• description of our system implementing semantic matchmaking services, and experi- 
mental evaluation. 

The remainder of the paper is structured as follows: the next Section reports on background
work on the subject. Then (Section 3) we briefly review the basics of Description Logics. To make
the paper self-contained we recall (Section 4) our logic-based framework for matchmaking,
pointing out properties that matchmaking algorithms and systems should guarantee. In
Sections 5 and 6 we present Concept Abduction and Concept Contraction, the two inference
services we devised to compute semantic matchmaking, and present suitable definitions
of the problem along with some complexity results. Then in Section 7 we describe our
matchmaker, and present (Section 7.1) an evaluation of results computed by the system
compared with the behavior of human users, and with a standard full-text retrieval approach.
Conclusions close the paper.

2. Related Work on Matchmaking 

Matchmaking has been investigated in recent years from a number of perspectives and for
different purposes, with renewed interest as information overload kept growing with
the widespread use of the Web. We try here to summarize some of the relevant related work.
Vague query answering, proposed by Motro (1988), was an initial effort to overcome
limitations of relational databases, using weights attributed to several search variables. More
recent approaches along these lines aim at extending SQL with "preference" clauses, in
order to softly matchmake data in structured databases (Kießling, 2002). Finin, Fritzson,
McKay, and McEntire (1994) proposed KQML as an agent communication language
oriented to matchmaking purposes. Kuokka and Harada (1996) investigated matchmaking
as a process that allowed potential producers/consumers to provide descriptions of their
products/needs, either directly or through agent mediation, to be later unified by an
engine identifying promising matches. Two engines were developed: the SHADE system,
which again used KQML, with KIF as description language and with matchmaking not
relying on any logical reasoning, and COINS, which adopted classical unstructured-text
information retrieval techniques, namely the SMART IR system. Similar methods were later
reconsidered in the GRAPPA system (Veit, Müller, Schneider, & Fiehn, 2001).
Classified-ads matchmaking, at a syntactic level, was proposed by Raman, Livny, and Solomon (1998)
to matchmake semi-structured descriptions advertising computational resources in a fashion 
anticipating Grid resources brokering. Matchmaking was used in SIMS (Arens, Knoblock, 
& Shen, 1996) to dynamically integrate queries; the approach used KQML, and LOOM 
as description language. LOOM is also used in the subsumption matching addressed by 
Gil and Ramachandran (2001). InfoSleuth (Jacobs & Shea, 1995), a system for discovery 
and integration of information, included an agent matchmaker, which adopted KIF and 
the deductive database language LDL++. Constraint-based approaches to matchmaking 
have been proposed and implemented in several systems, e.g., PersonaLogic 3 , Kasbah 4 and 
systems by Maes, Guttman, and Moukas (1999), Karacapilidis and Moraitis (2001), Wang, 
Liao, and Liao (2002), Strobel and Stolze (2002). 

Matchmaking as satisfiability of concept conjunction in DLs was first proposed in the 
same venue by Gonzales-Castillo, Trastour, and Bartolini (2001) and by Di Sciascio et al. 
(2001), and precisely defined by Trastour et al. (2002). Sycara, Paolucci, Van Velsen, and 
Giampapa (2003) introduced a specific language for agent advertisement in the framework 
of the Retsina Multiagent infrastructure. A matchmaking engine was developed (Sycara 
et al., 2002; Paolucci, Kawamura, Payne, & Sycara, 2002), which carries out the process on 
five possible levels. Such levels exploit both classical text-retrieval techniques and semantic
match using θ-subsumption. Nevertheless, standard features of a semantic-based system,
such as satisfiability checks, are unavailable. It is noteworthy that in this approach the notion
of plug-in match is introduced, to overcome in some way the limitations of a matching
approach based on exact matches. The approach of Paolucci et al. (2002) was later extended
by Li and Horrocks (2003), where two new levels for matching classification were introduced. 
A similar classification was proposed — in the same venue — by Di Noia et al. (2003c), along 
with properties that a matchmaker should have in a DL-based framework, and algorithms to 
classify and semantically rank matches within classes. Benatallah, Hacid, Rey, and Toumani 
(2003) proposed the Difference Operator in DLs for semantic matchmaking. The approach 
uses Concept Difference, followed by a covering operation optimized using hypergraph tech- 
niques, in the framework of web services discovery. We briefly comment on the relationship 
between Concept Difference and Concept Abduction at the end of Section 5. An initial DL-based
approach, adopting penalty functions ranking, has been proposed by Calì, Calvanese,
Colucci, Di Noia, and Donini (2004), in the framework of dating systems. An extended
matchmaking approach, with negotiable and strict constraints in a DL framework, has been
proposed by Colucci, Di Noia, Di Sciascio, Donini, and Mongiello (2005), using both Concept
Contraction and Concept Abduction. Matchmaking in DLs with locally-closed world
assumption applying autoepistemic DLs has been proposed by Grimm, Motik, and Preist
(2006).

3. http://www.PersonaLogic.com

4. http://www.kasbah.com

The need to work in some way with approximation and ranking in DL-based approaches
to matchmaking has also recently led to adopting fuzzy-DLs, as in Smart (Agarwal &
Lamparter, 2005), or hybrid approaches, as in the OWLS-MX matchmaker (Klusch, Fries,
Khalid, & Sycara, 2005). Such approaches, however, by relaxing the logical constraints, do not
allow any explanation or automated revision service.

Finally, it should be pointed out that matching in DLs, widely treated by Baader,
Küsters, Borgida, and McGuinness (1999), has no relation to matchmaking. In fact, in that
work expressions denoting concepts are considered, with variables in expressions. Then
a match is a substitution of variables with expressions that makes a concept expression
equivalent to another. Also the more general setting of concept rewriting in DLs has no
direct relation with matchmaking (see the discussion in Remark 1).

3. Description Logics Basics 

In this Section we summarize the basic notions and definitions about Description Logics
(DLs), and about Classic, the knowledge representation system our application is inspired
by. We provide hereafter a brief guided tour of the main characteristics of DLs, while the interested
reader can refer to the comprehensive handbook by Baader et al. (2003).

3.1 Description Logics 

Description Logics (a.k.a. Terminological Logics) are a family of logic formalisms for Knowledge
Representation. All DLs are endowed with a syntax and a semantics, which is usually
model-theoretic. The basic syntax elements of DLs are:

• concept names, e.g., Computer, CPU, Device, Software, 

• role names, e.g., hasSoftware, hasDevice,

• individuals, that are used for special named elements belonging to concepts. 

Intuitively, concepts stand for sets of objects, and roles link objects in different concepts,
as the role hasSoftware, which links computers to software. We do not use individuals in
our formalization, hence from now on we skip the parts regarding individuals.

Formally, a semantic interpretation is a pair $\mathcal{I} = (\Delta, \cdot^{\mathcal{I}})$, which consists of the domain
$\Delta$ and the interpretation function $\cdot^{\mathcal{I}}$, which maps every concept to a subset of $\Delta$, and every
role to a subset of $\Delta \times \Delta$.

Basic elements can be combined using constructors to form concept and role expressions,
and each DL has its distinguished set of constructors. Every DL allows one to form a
conjunction of concepts, usually denoted as $\sqcap$; some DLs also include disjunction $\sqcup$ and
complement $\neg$ to close concept expressions under boolean operations.

Roles can be combined with concepts using

• existential role quantification:

e.g., $Computer \sqcap \exists hasSoftware.WordProcessor$

which describes the set of computers whose software includes a word processor, and

• universal role quantification:

e.g., $Server \sqcap \forall hasCPU.Intel$

which describes servers with only Intel processors on board.

Other constructs may involve counting, as

• number restrictions:

e.g., $Computer \sqcap (\leq 1\ hasCPU)$

expresses computers with at most one CPU, and

e.g., $Computer \sqcap (\geq 4\ hasCPU)$

describes computers equipped with at least four CPUs.

Many other constructs can be defined, increasing the expressive power of the DL, up to 
n-ary relations (Calvanese, De Giacomo, & Lenzerini, 1998). 

In what follows, we call atomic concepts the union of concept names, negated concept
names, and unqualified number restrictions. We define the length of a concept C as the number
of atomic concepts appearing in C. We denote the length of C as $|C|$. Observe that we
consider $\top$ and $\bot$ to have zero length. We define the Quantification Nesting (QN) of a
concept as the following non-negative integer: the QN of an atomic concept is 0, the QN of a
universal role quantification $\forall R.F$ is 1 plus the QN of F, and the QN of a conjunction
$C_1 \sqcap C_2$ is the maximum between the QNs of the conjoined concepts $C_1$ and $C_2$.
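
The two measures just defined are simple structural recursions. The following Python sketch makes them concrete; the class names (`Atom`, `Top`, `Bottom`, `All`, `And`) and the function names are hypothetical choices for this illustration, and assigning QN 0 to the top and bottom concepts is an assumption, since the text only fixes their length.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:
    """A concept name, a negated concept name, or an unqualified number restriction."""
    label: str


@dataclass(frozen=True)
class Top:
    pass


@dataclass(frozen=True)
class Bottom:
    pass


@dataclass(frozen=True)
class All:
    """Universal role quantification: for all role-fillers, the filler concept holds."""
    role: str
    filler: "Concept"


@dataclass(frozen=True)
class And:
    conjuncts: Tuple["Concept", ...]


Concept = Union[Atom, Top, Bottom, All, And]


def length(c: Concept) -> int:
    """Number of atomic concepts occurring in c; top and bottom count as zero."""
    if isinstance(c, Atom):
        return 1
    if isinstance(c, (Top, Bottom)):
        return 0
    if isinstance(c, All):
        return length(c.filler)
    return sum(length(x) for x in c.conjuncts)


def quantification_nesting(c: Concept) -> int:
    """QN: 0 for atoms, 1 + QN(filler) for a universal, maximum over conjuncts."""
    if isinstance(c, (Atom, Top, Bottom)):
        return 0  # QN of top/bottom is an assumption of this sketch
    if isinstance(c, All):
        return 1 + quantification_nesting(c.filler)
    return max((quantification_nesting(x) for x in c.conjuncts), default=0)


server = And((Atom("Server"), Atom("(>= 2 hasCPU)"), All("hasCPU", Atom("Intel"))))
print(length(server), quantification_nesting(server))
```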

Expressions are given a semantics by defining the interpretation function over each
construct. For example, concept conjunction is interpreted as set intersection:
$(C \sqcap D)^{\mathcal{I}} = C^{\mathcal{I}} \cap D^{\mathcal{I}}$, and also the other boolean connectives $\sqcup$ and $\neg$, when present, are given the usual
set-theoretic interpretation of union and complement. The interpretation of constructs
involving quantification on roles needs to make domain elements explicit: for example,
$(\forall R.C)^{\mathcal{I}} = \{d_1 \in \Delta \mid \forall d_2 \in \Delta : (d_1, d_2) \in R^{\mathcal{I}} \rightarrow d_2 \in C^{\mathcal{I}}\}$.
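
To see these definitions at work on a finite interpretation, the sketch below computes the extension of a concept directly from the set-theoretic semantics. The tuple encoding of concepts, the function name `extension`, and the tiny domain are assumptions introduced for illustration only.

```python
def extension(concept, domain, concept_ext, role_ext):
    """Return the set of domain elements belonging to `concept`.

    concept is ("name", A), ("and", C, D), or ("all", R, C);
    concept_ext maps concept names to subsets of the domain;
    role_ext maps role names to sets of (d1, d2) pairs.
    """
    kind = concept[0]
    if kind == "name":
        return set(concept_ext.get(concept[1], set()))
    if kind == "and":
        # Conjunction is interpreted as set intersection.
        left = extension(concept[1], domain, concept_ext, role_ext)
        right = extension(concept[2], domain, concept_ext, role_ext)
        return left & right
    if kind == "all":
        # d1 qualifies when every role-filler d2 of d1 lies in the filler's extension.
        filler = extension(concept[2], domain, concept_ext, role_ext)
        pairs = role_ext.get(concept[1], set())
        return {d1 for d1 in domain
                if all(d2 in filler for (x, d2) in pairs if x == d1)}
    raise ValueError(f"unknown constructor: {kind}")


domain = {"s1", "s2", "c1", "c2"}
concept_ext = {"Server": {"s1", "s2"}, "Intel": {"c1"}}
role_ext = {"hasCPU": {("s1", "c1"), ("s2", "c1"), ("s2", "c2")}}

only_intel = ("all", "hasCPU", ("name", "Intel"))
print(extension(("and", ("name", "Server"), only_intel), domain, concept_ext, role_ext))
```

Elements without any `hasCPU` filler (here `c1` and `c2`) satisfy the universal quantification vacuously, which is why the conjunction with `Server` is needed to single out `s1`.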

3.2 TBoxes 

Concept expressions can be used in axioms, which can be either inclusions (symbol: $\sqsubseteq$) or
definitions (symbol: $\equiv$), and which impose restrictions on possible interpretations according
to the knowledge elicited for a given domain. For example, we could impose that monitors
can be divided into CRT and LCD using the two inclusions $Monitor \sqsubseteq LCDMonitor \sqcup CRTMonitor$
and $CRTMonitor \sqsubseteq \neg LCDMonitor$, or that computers for domestic use have
only one operating system, as in $HomePC \sqsubseteq (\leq 1\ hasOS)$. Definitions are useful to give a
meaningful name to particular combinations, as in $Server \equiv Computer \sqcap (\geq 2\ hasCPU)$.

Historically, sets of such axioms are called a TBox (Terminological Box). There are
several possible types of TBoxes. General TBoxes are made of General Concept Inclusions
(GCIs) of the form $C \sqsubseteq D$, where both C and D can be any concept of the DL. For
general TBoxes, the distinction between inclusions and definitions disappears, since any
definition $C \equiv D$ can be expressed by the two GCIs $C \sqsubseteq D$, $D \sqsubseteq C$. On the contrary, in
simple TBoxes (also called schemas by Calvanese, 1996, and by Buchheit, Donini, Nutt,
and Schaerf, 1998) only a concept name can appear on the left-hand side (l.h.s.) of an
axiom, and a concept name can appear on the l.h.s. of at most one axiom. Schemas can be
cyclic or acyclic, where cyclicity refers to the dependency graph $G_{\mathcal{T}}$ between concept names,
defined as follows: every concept name is a node in $G_{\mathcal{T}}$, and there is an arc from concept
name A to concept name B if A appears on the l.h.s. of an axiom, and B appears (at any
level) in the concept on the right-hand side. $\mathcal{T}$ is acyclic if $G_{\mathcal{T}}$ is, and it is cyclic otherwise.
We call an acyclic schema a simple TBox (Baader et al., 2003, Ch. 2). The depth of a simple
TBox $\mathcal{T}$ is the length of the longest path in $G_{\mathcal{T}}$. Only for simple TBoxes, unfolding has
been defined as the following process (see Appendix A for a definition): for every definition
$A \equiv C$, replace A with C in every concept; for every inclusion $A \sqsubseteq C$, replace A with $A \sqcap C$
in every concept. Clearly, such a process transforms every concept into an equivalent one,
where the TBox can be forgotten. However, for some TBoxes, unfolding can yield concepts
of exponential size w.r.t. the initial concepts. When such an exponential blow-up does not
happen, we call the TBox "bushy but not deep" (Nebel, 1990).
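
The dependency-graph notions used above (acyclicity and depth) can be sketched as a depth-first search. In the following Python fragment the TBox is abstracted to a map from each left-hand-side concept name to the set of concept names occurring on its right-hand side; this encoding, the function name `tbox_depth`, and the sample axioms (loosely modeled on the hardware domain of the examples) are assumptions of the sketch.

```python
from typing import Dict, Set


def tbox_depth(dependencies: Dict[str, Set[str]]) -> int:
    """Length (in arcs) of the longest path in the dependency graph.

    Raises ValueError if the graph is cyclic, i.e. the schema is not a simple TBox
    in the acyclic sense.
    """
    longest: Dict[str, int] = {}   # memo: longest path starting at each node
    on_stack: Set[str] = set()     # nodes on the current DFS path

    def longest_from(name: str) -> int:
        if name in longest:
            return longest[name]
        if name in on_stack:
            raise ValueError(f"cyclic TBox: {name} depends on itself")
        on_stack.add(name)
        best = 0
        for successor in dependencies.get(name, ()):
            best = max(best, 1 + longest_from(successor))
        on_stack.discard(name)
        longest[name] = best
        return best

    nodes = set(dependencies) | {b for bs in dependencies.values() for b in bs}
    return max((longest_from(n) for n in nodes), default=0)


deps = {
    "HomePC": {"PC", "Monitor"},
    "PC": {"Computer"},
    "Computer": {"StorageDevice", "Software"},
    "StorageDevice": {"Device"},
}
print(tbox_depth(deps))
```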

The semantics of axioms is based on set containment and equality: an interpretation $\mathcal{I}$
satisfies an inclusion $C \sqsubseteq D$ if $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$, and it satisfies a definition $C \equiv D$ when $C^{\mathcal{I}} = D^{\mathcal{I}}$.
A model of a TBox $\mathcal{T}$ is an interpretation satisfying all axioms of $\mathcal{T}$.

Observe that we make a distinction between the equivalence symbol $\equiv$ (used in axioms) and the
equality symbol $=$. We use equality to instantiate generic concept symbols with the concepts they
stand for, e.g., when we write "... where $C = A \sqcap \forall R.B$..." we mean that the concept
symbol C stands for the concept expression $A \sqcap \forall R.B$ in the text.

3.3 Reasoning Services 

DL-based systems usually provide two basic reasoning services: 

1. Concept Satisfiability: given a TBox $\mathcal{T}$ and a concept C, does there exist at least one
model of $\mathcal{T}$ assigning a non-empty extension to C? We abbreviate satisfiability of a
concept C w.r.t. a TBox $\mathcal{T}$ as $C \not\sqsubseteq_{\mathcal{T}} \bot$.

2. Subsumption: given a TBox $\mathcal{T}$ and two concepts C and D, is $C^{\mathcal{I}}$ always contained in
$D^{\mathcal{I}}$ for every model $\mathcal{I}$ of $\mathcal{T}$? We abbreviate subsumption between C and D w.r.t. $\mathcal{T}$
as $C \sqsubseteq_{\mathcal{T}} D$.

Since C is satisfiable iff C is not subsumed by $\bot$, complexity lower bounds for satisfiability
carry over (for the complement class) to subsumption, and upper bounds for subsumption
carry over to satisfiability. On the other hand, since C is subsumed by D iff $C \sqcap \neg D$ is
unsatisfiable, subsumption is reducible to satisfiability in DLs admitting general concept
negation, but not in those DLs in which $\neg D$ is outside the language, as in the DLs of the
next Section.

3.4 The System Classic 

The system Classic (Borgida, Brachman, McGuinness, & A. Resnick, 1989; Borgida &
Patel-Schneider, 1994) was originally developed as a general Knowledge Representation
system, and has been successfully applied to configuration (Wright, Weixelbaum, Vesonder,
Brown, Palmer, Berman, & Moore, 1993) and program repository management (Devanbu,
Brachman, Selfridge, & Ballard, 1991).

Its language has been designed to be as expressive as possible while still admitting
polynomial-time inferences for "bushy but not deep" TBoxes. So it provides intersection of
concepts but no union, universal but not existential quantification over roles, and number
restrictions over roles but no intersection of roles, since each of these combinations is known
to make reasoning NP-hard (Donini, Lenzerini, Nardi, & Nutt, 1991; Donini, 2003).

name                      system notation   syntax          semantics
------------------------  ----------------  --------------  ---------------------------------------------
conjunction               (and C D)         $C \sqcap D$    $C^{\mathcal{I}} \cap D^{\mathcal{I}}$
universal quantification  (all R C)         $\forall R.C$   $\{d_1 \mid \forall d_2 : (d_1, d_2) \in R^{\mathcal{I}} \rightarrow d_2 \in C^{\mathcal{I}}\}$
number restrictions       (at-least n R)    $(\geq n\ R)$   $\{d_1 \mid \sharp\{d_2 \mid (d_1, d_2) \in R^{\mathcal{I}}\} \geq n\}$
                          (at-most n R)     $(\leq n\ R)$   $\{d_1 \mid \sharp\{d_2 \mid (d_1, d_2) \in R^{\mathcal{I}}\} \leq n\}$

Table 1: Syntax and semantics of some constructs of Classic

name            system notation                 syntax                      semantics
--------------  ------------------------------  --------------------------  ---------------------------------------------
definition      (createConcept A C false)       $A \equiv C$                $A^{\mathcal{I}} = C^{\mathcal{I}}$
inclusion       (createConcept A C true)        $A \sqsubseteq C$           $A^{\mathcal{I}} \subseteq C^{\mathcal{I}}$
disjoint group  (createConcept $A_1$ C symbol)  disj($A_1, \ldots, A_k$)    for $i = 1, \ldots, k$: $A_i^{\mathcal{I}} \subseteq C^{\mathcal{I}}$,
                ...                                                         and for $j = i + 1, \ldots, k$:
                (createConcept $A_k$ C symbol)                              $A_i^{\mathcal{I}} \cap A_j^{\mathcal{I}} = \emptyset$

Table 2: Syntax and semantics of the TBox Classic assertions (symbol is a name denoting
the group of disjoint concepts)

For simplicity, we only consider a subset of the constructs, namely, conjunction, number
restrictions, and universal role quantifications, summarized in Table 1. We abbreviate the
conjunction $(\geq n\ R) \sqcap (\leq n\ R)$ as $(= n\ R)$. We omit the constructs ONE-OF(·) and FILLS(·,·),
which refer to individuals, and the construct SAME-AS(·,·), which equates fillers in functional roles.
The subset of Classic we refer to is known as $\mathcal{ALN}$ (Attributive Language with unqualified
Number restrictions) (Donini, Lenzerini, Nardi, & Nutt, 1997b). When number restrictions
are not present, the resulting DL is known as $\mathcal{AL}$ (Schmidt-Schauß & Smolka, 1991). $\mathcal{ALN}$
provides a minimal set of constructs that allow one to represent a concept taxonomy, disjoint
groups, role restrictions ($\mathcal{AL}$), and number restrictions ($\mathcal{N}$) to represent restrictions on the
number of fillers of a role.

Regarding axioms in a TBox, Classic allows one to state a simple TBox of assertions
of the form summarized in Table 2, where $A, A_1, \ldots, A_k$ are all concept names. Axioms
in the TBox are subject to the constraints that every concept name can appear at most
once as the l.h.s. in a TBox, and no concept name can appear both on the l.h.s. of a
definition and in a disjointness assertion.

Every Classic concept can be given a normal form. Here we consider the normal form
only for the constructs of $\mathcal{ALN}$ that we used in the ontologies and applications. Intuitively,
the normal form pre-computes all implications of a concept, including (possibly) its
unsatisfiability. The normal form can be reached, up to commutativity of the operator $\sqcap$,
using well-known normalization rules, that we report in Appendix A to make the paper
self-contained. The normal form of an unsatisfiable concept is simply $\bot$. Every satisfiable
concept C can be divided into three components: $C_{names} \sqcap C_{\sharp} \sqcap C_{all}$. The component $C_{names}$
is the conjunction of all concept names $A_1, \ldots, A_h$. The component $C_{\sharp}$ is the conjunction
of all number restrictions, no more than two for every role (the maximum at-least and the
minimum at-most for each role), including, for every conjunct of C of the form $\forall R.\bot$, the
number restriction $(\leq 0\ R)$ in $C_{\sharp}$. The component $C_{all}$ conjoins all concepts of the form
$\forall R.D$, one for each role R, where D is again in normal form. We call such a form Conjunctive
Normal Form (CNF), in analogy with Propositional Logic, and we observe that CNF
is unique (also said canonical), up to commutativity of conjunction.
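
The three-component shape of the normal form can be illustrated with a short recursive procedure. The Python sketch below works without a TBox and uses a simplified rule set inferred from the description above (the full rules are those of Appendix A, which this sketch does not claim to reproduce); the tuple encoding, the `BOTTOM` marker, and the function name `normalize` are assumptions.

```python
BOTTOM = "BOTTOM"  # marker for the normal form of an unsatisfiable concept


def normalize(conjuncts):
    """Normalize a conjunction given as a list of conjuncts.

    Conjuncts: ("name", A), ("atleast", n, R), ("atmost", n, R),
    ("all", R, [conjuncts]), or ("bottom",).
    Returns BOTTOM or a dict with keys names / atleast / atmost / all.
    """
    names, atleast, atmost, fillers = set(), {}, {}, {}
    for c in conjuncts:
        kind = c[0]
        if kind == "bottom":
            return BOTTOM
        if kind == "name":
            names.add(c[1])
        elif kind == "atleast":            # keep the maximum at-least per role
            atleast[c[2]] = max(atleast.get(c[2], 0), c[1])
        elif kind == "atmost":             # keep the minimum at-most per role
            atmost[c[2]] = min(atmost.get(c[2], c[1]), c[1])
        elif kind == "all":                # merge all universals on the same role
            fillers.setdefault(c[1], []).extend(c[2])
        else:
            raise ValueError(f"unknown conjunct: {c!r}")

    universal = {}
    for role, inner in fillers.items():
        inner_nf = normalize(inner)
        if inner_nf == BOTTOM:
            atmost[role] = 0               # "all R.bottom" is recorded as (<= 0 R)
        else:
            universal[role] = inner_nf

    for role, n in atleast.items():        # clash: more fillers required than allowed
        if role in atmost and n > atmost[role]:
            return BOTTOM
    return {"names": frozenset(names), "atleast": atleast,
            "atmost": atmost, "all": universal}


concept = [("name", "Computer"), ("atleast", 1, "hasCPU"), ("atleast", 2, "hasCPU"),
           ("atmost", 4, "hasCPU"), ("all", "hasCPU", [("name", "Intel")])]
print(normalize(concept))
```

Because the result is built from sets and dictionaries, two inputs that differ only in the order of their conjuncts yield equal normal forms, mirroring uniqueness up to commutativity.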

Moreover, the TBox in Classic can be embedded into the concepts, by expanding
definitions, adding the right-hand-side concepts of inclusions, and adding the negation
of disjoint concept names (see Appendix A for more details). For instance, suppose that a
TBox contains:

1. the definition $Server \equiv Computer \sqcap (\geq 2\ hasCPU)$,

2. the inclusion $Computer \sqsubseteq (\geq 1\ hasStorageDevice)$,

3. and the disjointness assertion disj(AMD, Intel).

Then, the concept $Server \sqcap \forall hasCPU.Intel$ can be rewritten into
$Computer \sqcap (\geq 2\ hasCPU) \sqcap (\geq 1\ hasStorageDevice) \sqcap \forall hasCPU.(Intel \sqcap \neg AMD)$,
which is equivalent to the former w.r.t.
models of the TBox. Observe that the concept name Computer is kept in the rewriting,
since the inclusion gives only a necessary condition $(\geq 1\ hasStorageDevice)$. The latter
concept can be safely conjoined to Computer, making the inclusion unnecessary, but it cannot
replace it, since $(\geq 1\ hasStorageDevice)$ is not a sufficient condition for Computer.
Instead, $Computer \sqcap (\geq 2\ hasCPU)$ replaces Server since it is a necessary and sufficient
condition for it. The disjointness assertion generates $Intel \sqcap \neg AMD$ as the range for $\forall hasCPU$.
Once this rewriting has been carried out over all concepts, the TBox can be safely ignored when
computing subsumption (and satisfiability). In general, this unfolding may lead to an
exponential blow-up of the TBox, making the entire computation (unfolding+subsumption) take
exponential time (and space) in the size of the initial concepts and TBox. Yet exponential-time
computation for subsumption is likely to be unavoidable, since even without rewriting,
taking the TBox into account makes subsumption NP-hard (Nebel, 1990).
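
The rewriting in this example can be reproduced mechanically. The following Python sketch embeds an acyclic TBox into a concept given as a list of conjuncts; the encoding, the function name `embed`, and the decision to ignore the parent concept of a disjoint group are simplifying assumptions of the sketch rather than the procedure of Appendix A.

```python
def embed(conjuncts, definitions, inclusions, disjoint_groups):
    """Embed an acyclic TBox into a conjunction of concepts.

    definitions / inclusions map a concept name to a list of conjuncts;
    disjoint_groups is a list of sets of concept names.
    Conjuncts: ("name", A), ("not", A), ("atleast", n, R), ("atmost", n, R),
    ("all", R, [conjuncts]).
    """
    result = []
    for c in conjuncts:
        if c[0] == "name":
            name = c[1]
            if name in definitions:
                # A defined name is replaced by its (recursively embedded) definition.
                result.extend(embed(definitions[name], definitions,
                                    inclusions, disjoint_groups))
                continue
            result.append(c)  # a primitive name is kept ...
            if name in inclusions:
                # ... and the necessary conditions of its inclusion are conjoined.
                result.extend(embed(inclusions[name], definitions,
                                    inclusions, disjoint_groups))
            for group in disjoint_groups:
                if name in group:
                    result.extend(("not", other) for other in sorted(group - {name}))
        elif c[0] == "all":
            result.append(("all", c[1], embed(c[2], definitions,
                                              inclusions, disjoint_groups)))
        else:
            result.append(c)
    return result


definitions = {"Server": [("name", "Computer"), ("atleast", 2, "hasCPU")]}
inclusions = {"Computer": [("atleast", 1, "hasStorageDevice")]}
disjoint_groups = [{"AMD", "Intel"}]

concept = [("name", "Server"), ("all", "hasCPU", [("name", "Intel")])]
for conjunct in embed(concept, definitions, inclusions, disjoint_groups):
    print(conjunct)
```

The output lists the same conjuncts as the rewritten concept above, up to the order of conjunction. On a cyclic TBox this recursion would not terminate, which is one reason the technique is restricted to acyclic schemas.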

The normal form of concepts can take the TBox embedding into account (see Appendix A.2).
In this case, the component $C_{names}$ of a Classic concept C contains concept
names $C_{names+}$ and negations of concept names $C_{names\neg}$. In the following, we denote the
CNF of a concept C w.r.t. a simple TBox $\mathcal{T}$ as $CNF(C, \mathcal{T})$. Again, in general, the size
of $CNF(C, \mathcal{T})$ may be exponential w.r.t. the size of C and $\mathcal{T}$. However, when $\mathcal{T}$ is fixed,
$CNF(C, \mathcal{T})$ has polynomial size w.r.t. the size of C, i.e., the exponential increase comes only
from the TBox unfolding. In fact, if k is the maximum size of an unfolded concept name
(a constant if $\mathcal{T}$ is fixed), the size of $CNF(C, \mathcal{T})$ can be at most k times the size of C. We
use this argument later in the paper, to decouple the complexity analysis of our reasoning
methods for matchmaking from the complexity raised by the TBox.

To ease presentation of what follows in the next Sections, we adopt a simple reference
ontology, pictured in Figure 1, which is used throughout the paper. To keep the representation
within $\mathcal{ALN}$, we modeled memory quantities with number restrictions, e.g., 20GB as
$(\geq 20000\ mb)$. For reasoners specialized for $\mathcal{ALN}$, this is not a problem, since a number n
is never expanded as n fillers (Borgida & Patel-Schneider, 1994; Donini et al., 1997b). For
more expressive DLs, Concrete Domains (Lutz, 1999) should be employed to represent such
quantities.

$CRTmonitor, LCDmonitor \sqsubseteq Monitor$
$DVDRecorder, FloppyDisk, HardDisk \sqsubseteq StorageDevice$
$Monitor, StorageDevice \sqsubseteq Device$
$Linux, Solaris, Windows2000, WindowsXp \sqsubseteq OperatingSystem$
$Browser, WordProcessor \sqsubseteq Software$
$PDA, PC \sqsubseteq Computer$
$Computer \sqsubseteq (\geq 1\ hasStorageDevice) \sqcap \forall hasStorageDevice.StorageDevice \sqcap \forall hasSoftware.Software \sqcap (\geq 1\ ram)$
$HomePC \sqsubseteq PC \sqcap (\geq 1\ hasSoftware) \sqcap (= 1\ hasOS) \sqcap (\geq 1\ hasMonitor) \sqcap \forall hasMonitor.Monitor$
$Server \sqsubseteq Computer \sqcap (\geq 2\ hasCPU) \sqcap \forall ram.(\geq 512\ mb) \sqcap \forall hasStorageDevice.(\geq 20000\ mb)$

Figure 1: Reference Ontology used for examples



4. Semantic Matchmaking Using Description Logics 

Matchmaking is a widely used term in a variety of frameworks, comprising several quite
different approaches. We begin this Section by trying to provide a generic and sound
definition of matchmaking.

Matchmaking is an information retrieval task whereby queries (a.k.a. de- 
mands) and resources (a.k.a. supplies) are expressed using semi-structured data 
in the form of advertisements, and task results are ordered (ranked) lists of those 
resources best fulfilling the query. 

This simple definition implies that, differently from classical unstructured-text Information
Retrieval systems, some structure in the advertisements is expected in a matchmaking
system, and matchmaking does not consider a fixed database-oriented relational structure.
Furthermore, database systems usually provide answers to queries that do not include a
relevance ranking, which should instead be considered in a matchmaking process.

Semantic matchmaking is a matchmaking task whereby queries and resources 
advertisements are expressed with reference to a shared specification of a con- 
ceptualization for the knowledge domain at hand, i.e., an ontology. 

From now on, we concentrate on semantic matchmaking in marketplaces, adopting specific 
terminology, to ease presentation of the approach. Nevertheless our approach applies to 
generic matchmaking of semantically annotated resources. 

We note that all definitions in this Section apply to every DL that can be used to
describe a marketplace (supplies, demands, background knowledge). We denote by $\mathcal{L}$ such
a generic DL. We suppose that a common ontology for supplies and demands is established,
as a TBox $\mathcal{T}$ in $\mathcal{L}$. Now a match between a supply and a demand can be evaluated
according to $\mathcal{T}$.

First of all, we remark that a logic-based representation of supplies and demands calls 
for generally Open-world descriptions, that is, the absence of a characteristic in the descrip- 
tion of a supply or demand should not be interpreted as a constraint of absence. Instead, 
it should be considered as a characteristic that could be either refined later, or left open 
if it is irrelevant for a user. Note that by "generally open" we mean that some specific 
characteristic might be declared to be closed. However, such a closure should be made 
piecewise, using some known declarative tool devised in Knowledge Representation for non- 
monotonic reasoning, such as Defaults in DLs (Baader & Hollunder, 1992), Autoepistemic 
DLs (Donini, Nardi, & Rosati, 1997a), Circumscription in DLs (Bonatti, Lutz, & Wolter, 
2006) etc. 

An analysis of recent literature allows us to categorize the semantic matchmaking process
between a supply $Sup$ and a demand $Dem$ w.r.t. a TBox $\mathcal{T}$ into five distinct classes:

• exact match: $Sup \equiv_{\mathcal{T}} Dem$, i.e., $Sup \sqsubseteq_{\mathcal{T}} Dem$ and $Dem \sqsubseteq_{\mathcal{T}} Sup$, which amounts
to a perfect match, regardless (in a semantic-based environment) of syntactic differences,
i.e., $Sup$ and $Dem$ are equivalent concepts (Di Sciascio et al., 2001; Gonzales-Castillo
et al., 2001).

• full match: $Sup \sqsubseteq_{\mathcal{T}} Dem$, which amounts to the demand being completely fulfilled
by the available supply, i.e., $Sup$ has at least all features required by $Dem$, but not
necessarily vice versa, since the matchmaking process is not symmetric (Di Noia et al.,
2003c); this kind of match is also named subsume match by Li and Horrocks (2003).

• plug-in match: $Dem \sqsubseteq_{\mathcal{T}} Sup$; it corresponds to demand $Dem$ being a sub-concept of
supply $Sup$, i.e., $Dem$ is more specific than $Sup$ (Sycara et al., 2002; Li & Horrocks,
2003).

• potential match: $Dem \sqcap Sup \not\sqsubseteq_{\mathcal{T}} \bot$, which corresponds to supply and demand having
something in common and no conflicting characteristics (Di Noia et al., 2003c). This
relation is also named intersection-satisfiable by Li and Horrocks (2003).

• partial match: $Dem \sqcap Sup \sqsubseteq_{\mathcal{T}} \bot$, which amounts to the presence of conflict between
the demand and the available supply (Di Noia et al., 2003c). This relation is also
named disjoint by Li and Horrocks (2003).$^5$
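The five classes can be illustrated with a minimal Python sketch. It is not the framework's implementation: it assumes a toy propositional fragment in which a concept is a conjunction of atomic names, the TBox contains only inclusions between names, and disjointness is declared as pairs. The names `Monitor` and `Cheap`, and all function names, are hypothetical. Since the classes overlap (an exact match is also a full match, and a full match is also a potential match), the sketch returns the most specific label.

```python
def closure(names, tbox):
    """Add every name implied by the inclusion axioms in tbox (name -> implied names)."""
    closed, frontier = set(names), list(names)
    while frontier:
        for implied in tbox.get(frontier.pop(), ()):
            if implied not in closed:
                closed.add(implied)
                frontier.append(implied)
    return frozenset(closed)


def unsatisfiable(names, tbox, disjoint):
    """A conjunction is unsatisfiable if it implies both names of a disjoint pair."""
    closed = closure(names, tbox)
    return any(a in closed and b in closed for a, b in disjoint)


def subsumed(c, d, tbox, disjoint):
    """c is subsumed by d w.r.t. the TBox: c implies every name required by d."""
    return unsatisfiable(c, tbox, disjoint) or closure(d, tbox) <= closure(c, tbox)


def classify(sup, dem, tbox, disjoint):
    """Return the most specific match class of supply sup against demand dem."""
    sup, dem = frozenset(sup), frozenset(dem)
    if unsatisfiable(sup | dem, tbox, disjoint):
        return "partial"
    sup_in_dem = subsumed(sup, dem, tbox, disjoint)
    dem_in_sup = subsumed(dem, sup, tbox, disjoint)
    if sup_in_dem and dem_in_sup:
        return "exact"
    if sup_in_dem:
        return "full"
    if dem_in_sup:
        return "plug-in"
    return "potential"


# Hypothetical toy ontology: both monitor kinds are monitors, and they exclude each other.
TBOX = {"LCDmonitor": {"Monitor"}, "CRTmonitor": {"Monitor"}}
DISJOINT = [("LCDmonitor", "CRTmonitor")]
```

For instance, `classify({"LCDmonitor"}, {"Monitor"}, TBOX, DISJOINT)` yields a full match, while swapping the two arguments yields a plug-in match, which reflects the asymmetry of the classification.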

We stress that demands could be classified in the same way w.r.t. a given supply, when 
it's the supplier's turn to look into the marketplace to find potential buyers. Hence, in the 
rest of the paper we use the term offer — denoted by the symbol D — to mean either a supply 
Sup or a demand Dem, and the term counteroffer — denoted by C — to mean, respectively, 
the demand Dem or the supply Sup that could match D. 

Such a classification is still a coarse one, relying directly on known logical relations 
between formulae. In fact, the result of matchmaking should be a rank of counteroffers, 
according to some criteria — possibly explicit — so that a user trusting the system would 
know whom to contact first, and in case of failure, whom next, and so on. Such a ranking 
process should satisfy some criteria that a Knowledge Representation approach suggests. 

We formulate ranking requirements by referring to properties of penalty functions. 

Definition 1 Given a DL $\mathcal{L}$, two concepts $C, D \in \mathcal{L}$, and a TBox $\mathcal{T}$ in $\mathcal{L}$, a penalty
function is a three-argument function $p(C, D, \mathcal{T})$ that returns a non-negative integer.

We use penalty functions to rank counteroffers $C$ for a given demand (or supply) $D$ w.r.t. a
TBox $\mathcal{T}$. Intuitively, for two given counteroffers $C_1$, $C_2$ in the marketplace, if $p(C_1, D, \mathcal{T}) <
p(C_2, D, \mathcal{T})$ then the issuer of offer $D$ should rank $C_1$ better than $C_2$ when deciding whom to
contact first. Clearly, a 0-penalty should be ranked best, and counteroffers with the same
penalty should be ranked by breaking ties. The first property we recall is Non-symmetric
evaluation of proposals.

Definition 2 A penalty function $p(\cdot, \cdot, \cdot)$ is non-symmetric if there exist concepts $C, D$ and
a TBox $\mathcal{T}$ such that $p(C, D, \mathcal{T}) \neq p(D, C, \mathcal{T})$.

This property is evident when all constraints of D are fulfilled by C but not vice versa. 
Hence, C should be among the top-ranked counteroffers in the list of potential partners of 
D, while D should not necessarily appear at the top in the list of potential partners of C. 
So, a penalty function $p(\cdot, \cdot, \cdot)$ should not be expected to be a metric distance function.
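A minimal sketch of penalty-based ranking and of non-symmetric evaluation follows. It assumes a deliberately naive penalty (the number of names required by the offer that the counteroffer does not state, with an empty TBox); this toy function, the labels `c1`, `c2`, `c3`, and the name `DVD` are hypothetical and stand in for whatever penalty function is actually adopted.

```python
def penalty(c, d):
    """Toy penalty: number of names required by offer d that counteroffer c lacks."""
    return len(frozenset(d) - frozenset(c))


def rank(counteroffers, d):
    """Return counteroffer labels by increasing penalty; ties keep insertion order."""
    return sorted(counteroffers, key=lambda label: penalty(counteroffers[label], d))


offer = {"HomePC", "LCDmonitor"}
counteroffers = {
    "c1": {"HomePC"},
    "c2": {"HomePC", "LCDmonitor", "DVD"},
    "c3": set(),
}

ranking = rank(counteroffers, offer)
```

Here `c2` fulfills every constraint of `offer` and gets penalty 0, whereas `offer` evaluated as a counteroffer for `c2` gets penalty 1 because it does not mention `DVD`: the two directions differ, as Definition 2 requires.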

Secondly, if logic is used to give some meaning to descriptions of supplies and demands, 
then proposals with the same meaning should be equally penalized, independently of their 
syntactic descriptions. 

Definition 3 A penalty function $p(\cdot, \cdot, \cdot)$ is syntax independent if for every triple of
concepts $C_1, C_2, D$, and TBox $\mathcal{T}$, when $\mathcal{T} \models C_1 \equiv C_2$ then $p(C_1, D, \mathcal{T}) = p(C_2, D, \mathcal{T})$, and the
same holds also for the second argument, i.e., $p(D, C_1, \mathcal{T}) = p(D, C_2, \mathcal{T})$.

5. We note that preferring the term "partial match" instead of "disjoint", we stress that the match may 
still be recoverable, while disjoint is usually meant as a hopeless situation. Moreover, "disjoint" and 
"intersection satisfiable" refer to the set-theoretic semantics of concepts in Description Logics, which 
is quite hidden and far from the original problems of matchmaking. In a word, they are
technology-oriented and not problem-oriented. For instance, if one used Propositional Logic, or
Three-valued Logic for modeling matchmaking, those terms would make no sense.
Clearly, when the logic admits a normal form of expressions (such as CNF or DNF for
propositional logic, or the normal form of concepts for DLs defined in the previous Section), using
such a normal form in the computation of $p(\cdot, \cdot, \cdot)$ ensures by itself syntax independence.

Penalties should enjoy some desirable properties w.r.t. subsumption. For reasons
explained below, we distinguish penalty functions for ranking potential matches from those for
ranking partial (conflicting) matches.

Definition 4 A penalty function for potential matches is monotonic over subsumption
whenever for every issued offer $D$, for every pair of counteroffers $C_1$ and $C_2$, and TBox $\mathcal{T}$,
if $C_1$ and $C_2$ are both potential matches for $D$ w.r.t. $\mathcal{T}$, and $C_1 \sqsubseteq_{\mathcal{T}} C_2$, then
$p(C_1, D, \mathcal{T}) \leq p(C_2, D, \mathcal{T})$.

Intuitively, the above definition could be read as: if $C_1 \sqsubseteq_{\mathcal{T}} C_2$ then $C_1$ should be penalized
(and then ranked) either the same, or better than $C_2$. In a phrase, a ranking of potential
matches is monotonic over subsumption if more specific means better. A dual property
could be stated for the second argument: if $D_1 \sqsubseteq_{\mathcal{T}} D_2$ then a counteroffer $C$ is less likely to
fulfill all characteristics required by $D_1$ than $D_2$. However, since our scenario is: "given an
issuer of a proposal $D$ looking for a match in the marketplace, rank all possible counteroffers
$C_1, C_2, \ldots$, from the best one to the worst", we do not deal here with this duality between
first and second argument of $p(\cdot, \cdot, \cdot)$.

When turning to partial matches, in which some properties are already in conflict be- 
tween supply and demand, the picture reverses. Now, adding another characteristic to an 
unsatisfactory proposal may only worsen this ranking (when another characteristic is vio- 
lated) or keep it the same (when the new characteristic is not in conflict). Note that this 
ranking should be kept different from the ranking for potential matches. After all, accepting 
to discard one or more characteristics that we required is much worse than deciding which 
proposal to try first among some potential ones.

Definition 5 A penalty function for partial matches is antimonotonic over subsumption
whenever for every issued offer $D$, for every pair of counteroffers $C_1$ and $C_2$, and TBox $\mathcal{T}$,
if $C_1$ and $C_2$ are both partial matches for $D$ w.r.t. $\mathcal{T}$, and $C_1 \sqsubseteq_{\mathcal{T}} C_2$, then
$p(C_1, D, \mathcal{T}) \geq p(C_2, D, \mathcal{T})$.

Intuitively, if $C_1 \sqsubseteq_{\mathcal{T}} C_2$ then $C_1$ should be penalized (and then ranked) either the same,
or worse than $C_2$. In other words, a ranking of partial matches is antimonotonic over
subsumption if more specific means worse. The same property should hold also for the
second argument, since concept conjunction is commutative.

When we need to distinguish between a penalty function for potential matches and one
for partial matches, we put a subscript $\sqsubseteq$ in the former (as in $p_{\sqsubseteq}$) and a subscript $\bot$ for the
latter (as in $p_{\bot}$).

Clearly, the above requirements are very general, and leave ample room for the definition 
of penalty functions. A more subtle requirement would be that penalties should not change 
when irrelevant details are added, e.g., if a second-hand computer is requested in a demand 
Dem, with no specification for the brand of the CPU, then a supply Sup should be penalized 
the same as another offer $Sup \sqcap \forall hasCPU.Intel$. However, instead of delving into irrelevance
and other logic-related issues directly from penalties, we now borrow well-known logical
reasoning frameworks in propositional knowledge representation. Such a detour will give us
a sound and declarative way of defining penalties, dealing with irrelevance as a byproduct,
and more generally bring well-studied non-standard reasoning techniques into matchmaking.

5. Concept Abduction 

Abduction (Peirce, 1955) is a well known form of commonsense reasoning, usually aimed at 
finding an explanation for some given symptoms or manifestations. Here we introduce Con- 
cept Abduction in DLs, showing how it can model potential matchmaking in a DL setting. 
Following the notation proposed by Eiter and Gottlob (1995), we recall that a Propositional 
Abduction Problem is a triple $\langle H, M, T \rangle$ where $H$ (Hypotheses) and $M$ (Manifestations)
are sets of literals, and $T$ (Theory) is a set of formulae. A solution for $\langle H, M, T \rangle$ is an
Explanation $E \subseteq H$ such that $T \cup E$ is consistent, and $T \cup E \models M$. We adapt this framework
to DLs as follows.

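Definition 6 Let $\mathcal{L}$ be a DL, $C$, $D$ be two concepts in $\mathcal{L}$, and $\mathcal{T}$ be a set of axioms in
$\mathcal{L}$, where both $C$ and $D$ are satisfiable in $\mathcal{T}$. The Concept Abduction Problem (CAP) for
a given $\langle \mathcal{L}, C, D, \mathcal{T} \rangle$ is finding, if possible, a concept $H \in \mathcal{L}$ such that $C \sqcap H \not\sqsubseteq_{\mathcal{T}} \bot$, and
$C \sqcap H \sqsubseteq_{\mathcal{T}} D$.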

We use $\mathcal{P}$ as a symbol for a generic CAP, and we denote with $SOL(\mathcal{P})$ the set of all
solutions to a CAP $\mathcal{P}$. Observe that in the definition, we limit the inputs of a CAP to
satisfiable concepts $C$ and $D$, since $C$ unsatisfiable implies that the CAP has no solution
at all, while $D$ unsatisfiable leads to counterintuitive results (e.g., $\neg C$ would be a solution
in that case). As Propositional Abduction extends implication, Concept Abduction
extends concept subsumption. But differently from propositional abduction, we do not make
any distinction between manifestations and hypotheses, which is usual when abduction is
used for diagnosis. In fact, when making hypotheses about e.g., properties of goods in
e-marketplaces, there is no point in making such a distinction. This uniformity implies that
there is always the trivial solution $D$ to a non-trivial CAP $\langle \mathcal{L}, C, D, \mathcal{T} \rangle$, as stated more
formally as follows.

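Proposition 1 Let $\mathcal{L}$ be a DL, let $C$, $D$ be concepts in $\mathcal{L}$, and $\mathcal{T}$ an $\mathcal{L}$-TBox. Then
$C \sqcap D \not\sqsubseteq_{\mathcal{T}} \bot$ if and only if $D \in SOL(\langle \mathcal{L}, C, D, \mathcal{T} \rangle)$.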

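Proof. If $C \sqcap D$ is satisfiable in $\mathcal{T}$, then $D$ fulfills both requirements of Def. 6, the first
one by hypothesis and the second one because $C \sqcap D \sqsubseteq_{\mathcal{T}} D$ is a tautology. On the other
hand, if $D \in SOL(\langle \mathcal{L}, C, D, \mathcal{T} \rangle)$ then $C \sqcap D \not\sqsubseteq_{\mathcal{T}} \bot$ by definition. □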

A simple interpretation of this property in our application domain, i.e., matchmaking,
is that if we hypothesize for the counteroffer $C$ exactly all specifications in $D$, then the
counteroffer trivially meets given specifications, if it was compatible anyway. However, not
all solutions to a CAP are equivalent when using Concept Abduction for matchmaking. To
make a simple example, suppose that already $C \sqsubseteq_{\mathcal{T}} D$. Then, both $H_1 = D$ and $H_2 = \top$
(among others) are solutions of $\langle \mathcal{L}, C, D, \mathcal{T} \rangle$. Yet, the solution $H_2 = \top$ tells the issuer of
$D$ that $C$ already meets all of $D$'s specifications, while the solution $H_1 = D$ is the least
informative solution from this point of view. Hence, if we want to use abduction to highlight
the most promising counteroffers, "minimal" hypotheses must be defined.

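Definition 7 Let $\mathcal{P} = \langle \mathcal{L}, C, D, \mathcal{T} \rangle$ be a CAP. The set $SOL_{\sqsubseteq}(\mathcal{P})$ is the subset of $SOL(\mathcal{P})$
whose concepts are maximal under $\sqsubseteq_{\mathcal{T}}$. The set $SOL_{\leq}(\mathcal{P})$ is the subset of $SOL(\mathcal{P})$ whose
concepts have minimum length.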

Clearly, being maximal w.r.t. $\sqsubseteq_{\mathcal{T}}$ is still a minimality criterion, since it means that no
unnecessary hypothesis is assumed. It can be proved that the two measures are
incomparable.

Proposition 2 There exists a CAP $\mathcal{P}$ such that the two sets $SOL_{\sqsubseteq}(\mathcal{P})$ and $SOL_{\leq}(\mathcal{P})$ are
incomparable.

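Proof. It is sufficient to consider $D = A_1 \sqcap A_2 \sqcap A_3$, $C = A_1$, and $\mathcal{T} = \{B \sqsubseteq A_2 \sqcap A_3\}$. The
logic is even propositional. Then $A_2 \sqcap A_3 \in SOL_{\sqsubseteq}(\langle \mathcal{L}, C, D, \mathcal{T} \rangle)$, $B \in SOL_{\leq}(\langle \mathcal{L}, C, D, \mathcal{T} \rangle)$,
and neither solution is in the other set. □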
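The counterexample in the proof can be checked by brute force. The following sketch assumes the same toy propositional setting as the proof (concepts as sets of names, a TBox of inclusions between names, no disjointness, so every conjunction is satisfiable) and restricts candidate hypotheses to conjunctions over the four names involved. The function names are hypothetical.

```python
from itertools import combinations


def closure(names, tbox):
    """Add every name implied by the inclusion axioms in tbox (name -> implied names)."""
    closed, frontier = set(names), list(names)
    while frontier:
        for implied in tbox.get(frontier.pop(), ()):
            if implied not in closed:
                closed.add(implied)
                frontier.append(implied)
    return frozenset(closed)


def solutions(c, d, tbox, vocabulary):
    """Enumerate every conjunction H over vocabulary such that C and H together imply D."""
    needed = closure(d, tbox)
    found = []
    for size in range(len(vocabulary) + 1):
        for combo in combinations(sorted(vocabulary), size):
            h = frozenset(combo)
            if needed <= closure(frozenset(c) | h, tbox):
                found.append(h)
    return found


def subsumption_maximal(sols, tbox):
    """Keep solutions that no other solution strictly subsumes (i.e., is more general than)."""
    def strictly_more_general(g, h):
        return closure(g, tbox) < closure(h, tbox)
    return [h for h in sols if not any(strictly_more_general(g, h) for g in sols)]


def length_minimal(sols):
    """Keep solutions with the fewest conjuncts."""
    shortest = min(len(h) for h in sols)
    return [h for h in sols if len(h) == shortest]


TBOX = {"B": {"A2", "A3"}}
SOLS = solutions({"A1"}, {"A1", "A2", "A3"}, TBOX, {"A1", "A2", "A3", "B"})
```

Under these assumptions the only subsumption-maximal solution is the conjunction of `A2` and `A3`, and the only length-minimal one is `B`, so neither set contains the other.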

The proof highlights that, although $\leq$-minimality could be preferable for conciseness, it
is heavily dependent on $\mathcal{T}$. In fact, for every concept $H \in SOL(\mathcal{P})$, it is sufficient to add the
axiom $A \equiv H$ to get a $\leq$-minimal solution $A$. On the other hand, also $\sqsubseteq_{\mathcal{T}}$-maximality has
some drawbacks: if concept disjunction $\sqcup$ is present in $\mathcal{L}$, then there is a single $\sqsubseteq_{\mathcal{T}}$-maximal
solution of $\mathcal{P}$, that is equivalent to the disjunction of all solutions in $SOL(\mathcal{P})$, which is not a very
useful solution. Making an analogy with Abduction-based Diagnosis (Console, Dupre, &
Torasso, 1991), we could say that the disjunction of all possible explanations is not a very
informative explanation itself, although it is maximal w.r.t. implication. We note that
finding a $\leq$-minimal solution is NP-hard for a TBox of depth 1, by a simple reduction from
Set Covering (Colucci, Di Noia, Di Sciascio, Donini, & Mongiello, 2004).

Remark 1 It is interesting to analyze whether concept minimal-rewriting techniques, as
defined by Baader, Küsters, and Molitor (2000), could be employed for computing some
minimal concept abduction, trying to rewrite $C \sqcap D$. The answer is definitely negative for
minimal length abduction: the length-minimal solution $B$ in the proof of Proposition 2
could not be obtained by rewriting $C \sqcap D = A_1 \sqcap A_1 \sqcap A_2 \sqcap A_3$. In fact, $A_1 \sqcap B$ is not
an equivalent rewriting of the former concept. Regarding $\sqsubseteq_{\mathcal{T}}$-maximality the answer is
more indirect. In fact, present rewriting techniques do not keep a subconcept fixed in the
rewriting process. So consider a CAP in which $D = A_1$, $C = A_2$, and $\mathcal{T} = \{B \equiv A_1 \sqcap A_2\}$.
The only equivalent minimal rewriting of $C \sqcap D$ is then $B$, in which a solution cannot be
identified since $B$ cannot be separated into a concept $C$ (the original one) and a concept
$H$ that is a solution of the CAP. It is open whether future extensions of rewriting might
keep a concept fixed, and cope with this problem.

A third minimality criterion is possible for DLs which admit CNF, as for $\mathcal{L} = \mathcal{ALN}$.

Definition 8 Let $\mathcal{P} = \langle \mathcal{L}, C, D, \mathcal{T} \rangle$ be a CAP in which $\mathcal{L}$ admits CNF, and assume that
concepts in $SOL(\mathcal{P})$ are in CNF. The set $SOL_{\sqcap}(\mathcal{P})$ is the subset of $SOL(\mathcal{P})$ whose concepts
are minimal conjunctions, i.e., if $C \in SOL_{\sqcap}(\mathcal{P})$ then no sub-conjunction of $C$ (at any level
of nesting) is in $SOL(\mathcal{P})$. We call such solutions irreducible.
It turns out that $\sqcap$-minimality includes both $\sqsubseteq_{\mathcal{T}}$-maximality and $\leq$-minimality.

Proposition 3 For every CAP $\mathcal{P}$ in which $\mathcal{L}$ admits a CNF, both $SOL_{\sqsubseteq}(\mathcal{P})$ and $SOL_{\leq}(\mathcal{P})$
are included in $SOL_{\sqcap}(\mathcal{P})$.

Proof. By contraposition, if a concept $H$ is not $\sqcap$-minimal then there is another concept
$H'$, a sub-conjunction of $H$, which is a $\sqcap$-minimal solution. But $|H'| < |H|$, hence $H$ is
not length-minimal. The same for $\sqsubseteq_{\mathcal{T}}$-maximality: since every sub-conjunction of a concept
$H$ in CNF subsumes $H$, if $H$ is not $\sqcap$-minimal it is not $\sqsubseteq_{\mathcal{T}}$-maximal either. □

The proof of Proposition 2 can be modified to show that minimum-length abduced
concepts are not unique: it is sufficient to add another axiom $B' \sqsubseteq A_2 \sqcap A_3$ to obtain
another minimum-length solution $B'$. A less obvious result is that also subsumption-maximal
solutions are not unique, at least in non-simple TBoxes: Let $\mathcal{P} = \langle \mathcal{L}, C, D, \mathcal{T} \rangle$
with $\mathcal{T} = \{A_2 \sqcap A_3 \sqsubseteq A_1\}$, $C = A_3$, $D = A_1$. Then both $A_1$ and $A_2$ are $\sqsubseteq_{\mathcal{T}}$-maximal
solutions.

5.1 Irreducible Solutions in $\mathcal{ALN}$-simple TBoxes

We assume here that the TBox $\mathcal{T}$ of a CAP $\mathcal{P} = \langle \mathcal{L}, C, D, \mathcal{T} \rangle$ is always a simple one. Finding
an irreducible solution is easier than finding a $\leq$-minimal or a $\sqsubseteq_{\mathcal{T}}$-maximal solution, since a
greedy approach can be used to minimize the set of conjuncts in the solution. For example,
starting from $C \sqcap D$, we could delete one redundant conjunct at a time (at any level of
role quantification nesting) from $D$, using $|D|$ calls to a subsumption-check procedure.
However, such an algorithm would be interesting only for theoretical purposes. Instead, we
adapt a structural subsumption algorithm (Borgida & Patel-Schneider, 1994) that collects
all concepts $H$ that should be conjoined to $C$ in order for $C \sqcap H$ to be subsumed by $D$.
The algorithm operates on concepts in CNF. In the following algorithm, we abbreviate the
fact that a concept $A$ appears as a conjunct of a concept $C$ with $A \in C$ (thus extending the
meaning of $\in$ to conjunctions of concepts).

Algorithm findIrred($\mathcal{P}$);

input: a CAP $\mathcal{P} = \langle \mathcal{L}, C, D, \mathcal{T} \rangle$, with $\mathcal{L} = \mathcal{ALN}$, simple $\mathcal{T}$, $C$ and $D$ in CNF w.r.t. $\mathcal{T}$

output: concept $H \in SOL_{\sqcap}(\mathcal{P})$ (where $H = \top$ means that $C \sqsubseteq D$)

variables: concept $H$
begin

    $H := \top$;

    0. if $D \sqcap C \sqsubseteq_{\mathcal{T}} \bot$
           return $\bot$;

    1. for every concept name $A$ in $D$
       1.1 if $A \notin C$
               then $H := H \sqcap A$;

    2. for every concept $(\geq n\, R) \in D$
       2.1 such that there is no concept $(\geq m\, R) \in C$ with $m \geq n$
               $H := H \sqcap (\geq n\, R)$;

    3. for every concept $(\leq n\, R) \in D$
       3.1 such that there is no concept $(\leq m\, R) \in C$ with $m \leq n$
               $H := H \sqcap (\leq n\, R)$;

    4. for every concept $\forall R.E \in D$
       4.1 if there exists $\forall R.F \in C$
           4.1.1 then $H := H \sqcap \forall R.$findIrred($\langle \mathcal{ALN}, F, E, \mathcal{T} \rangle$);
           4.1.2 else $H := H \sqcap \forall R.E$;

    /* now $H \in SOL(\mathcal{P})$, but it might be reducible */

    5. for every concept $H_i \in H$
           if $H$ without $H_i$ is in $SOL(\mathcal{P})$
           then delete $H_i$ from $H$;

    6. return $H$;
end.
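Steps 0 to 4 of the algorithm can be sketched in Python under some simplifying assumptions that are not taken from the paper: the TBox is empty, a concept in CNF is a dictionary with at most one number restriction of each kind and one universal quantification per role, negated names are written with a `"not "` prefix, and `None` stands for $\bot$. Step 5 is omitted on the assumption that, with an empty TBox and normalized inputs, no conjunct is redundant. The `length` measure (names plus number restrictions at any nesting depth) is likewise only one plausible choice.

```python
def concept(names=(), atleast=None, atmost=None, all_=None):
    """Build a CNF concept; concept() with no arguments is TOP."""
    return {"names": frozenset(names), "atleast": dict(atleast or {}),
            "atmost": dict(atmost or {}), "all": dict(all_ or {})}


TOP = concept()


def conjoin(c, d):
    """CNF of the conjunction of c and d (empty TBox); None if it is unsatisfiable."""
    if c is None or d is None:
        return None
    names = c["names"] | d["names"]
    if any(("not " + n) in names for n in names):
        return None
    atleast = {r: max(c["atleast"].get(r, 0), d["atleast"].get(r, 0))
               for r in c["atleast"].keys() | d["atleast"].keys()}
    atmost = {r: min(x[r] for x in (c["atmost"], d["atmost"]) if r in x)
              for r in c["atmost"].keys() | d["atmost"].keys()}
    fillers = {}
    for r in c["all"].keys() | d["all"].keys():
        filler = conjoin(c["all"].get(r, TOP), d["all"].get(r, TOP))
        if filler is None:
            atmost[r] = 0          # "all R.BOTTOM" is equivalent to "at most 0 R"
        else:
            fillers[r] = filler
    if any(atleast[r] > atmost[r] for r in atleast.keys() & atmost.keys()):
        return None
    return {"names": names, "atleast": atleast, "atmost": atmost, "all": fillers}


def length(h):
    """Assumed length: names and number restrictions, counted at every nesting level."""
    return (len(h["names"]) + len(h["atleast"]) + len(h["atmost"])
            + sum(length(f) for f in h["all"].values()))


def find_irred(c, d):
    """Steps 0-4 of findIrred for an empty TBox; returns None for BOTTOM."""
    if conjoin(c, d) is None:                          # Step 0: partial match
        return None
    h = concept(names=d["names"] - c["names"])         # Step 1: missing names
    for r, n in d["atleast"].items():                  # Step 2: weaker or no at-least in C
        if c["atleast"].get(r, 0) < n:
            h["atleast"][r] = n
    for r, n in d["atmost"].items():                   # Step 3: weaker or no at-most in C
        if r not in c["atmost"] or c["atmost"][r] > n:
            h["atmost"][r] = n
    for r, e in d["all"].items():                      # Step 4: recurse on role fillers
        sub = find_irred(c["all"][r], e) if r in c["all"] else e
        if sub is None:
            h["atmost"][r] = 0
        elif length(sub) > 0:                          # drop the trivial "all R.TOP"
            h["all"][r] = sub
    return h


# Hypothetical supply and demand descriptions.
SUPPLY = concept({"PC"}, atleast={"hasDisk": 1},
                 all_={"hasMonitor": concept({"Monitor"})})
DEMAND = concept({"PC", "Laptop"}, atleast={"hasDisk": 2}, atmost={"hasUSB": 4},
                 all_={"hasMonitor": concept({"Monitor", "LCD"})})
H = find_irred(SUPPLY, DEMAND)
```

With these inputs `H` lists what has to be hypothesized about the supply (the name `Laptop`, at least two disks, at most four USB ports, and an `LCD` monitor), while swapping the arguments returns `TOP`, since the demand already implies everything stated by the supply.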

Theorem 1 Given a CAP $\mathcal{P}$, if findIrred($\mathcal{P}$) returns the concept $H$, with $H \neq \bot$, then $H$
is an irreducible solution of $\mathcal{P}$.

Proof. We first prove that before Step 5, the computed concept $H$ is in $SOL(\mathcal{P})$, that is,
both $C \sqcap H \not\sqsubseteq_{\mathcal{T}} \bot$ and $C \sqcap H \sqsubseteq_{\mathcal{T}} D$ hold. In fact, observe that $CNF(D, \mathcal{T}) \sqsubseteq H$, since all
conjuncts of $H$ come from some conjunct of $CNF(D, \mathcal{T})$. Hence, $D \sqsubseteq_{\mathcal{T}} H$ since $CNF(D, \mathcal{T})$
is equivalent to $D$ in the models of $\mathcal{T}$. Adding $C$ to both sides of the subsumption yields
$C \sqcap D \sqsubseteq_{\mathcal{T}} C \sqcap H$, and since we assume that $C \sqcap D \not\sqsubseteq_{\mathcal{T}} \bot$, also $C \sqcap H \not\sqsubseteq_{\mathcal{T}} \bot$. This proves the
first condition for $H \in SOL(\mathcal{P})$. Regarding the condition $C \sqcap H \sqsubseteq_{\mathcal{T}} D$, suppose it does not
hold: then, at least one conjunct of $CNF(D, \mathcal{T})$ should not appear in $CNF(C \sqcap H, \mathcal{T})$. But
this is not possible by construction, since $H$ contains every conjunct which is in $CNF(D, \mathcal{T})$
and not in $CNF(C, \mathcal{T})$. Therefore, we conclude that $H \in SOL(\mathcal{P})$. Once we proved that
the $H$ computed before Step 5 is a solution of $\mathcal{P}$, we just note that Step 5 deletes enough
conjuncts to make $H$ an irreducible solution. □

The first part of the algorithm (before Step 5) easily follows well-known structural
subsumption algorithms (Borgida & Patel-Schneider, 1994). Step 5 applies a greedy approach, hence
the computed solution, although irreducible, might not be minimal.

We explain the need for the reducibility check in Step 5 with the help of the following 
example. 

Example 1 Let $\mathcal{T} = \{A_1 \sqsubseteq A_2, A_3 \sqsubseteq A_4\}$, and let $C = A_3$, $D = A_1 \sqcap A_4$. Then $\mathcal{L}$ is the
propositional part of $\mathcal{AL}$. The normal form for $C$ is $C' = A_3 \sqcap A_4$, while $D' = A_1 \sqcap A_2 \sqcap A_4$.
Then before Step 5 the algorithm computes $H = A_1 \sqcap A_2$, which must still be reduced to
$A_1$. It is worth noticing that $H$ is already subsumption-maximal since $H \equiv_{\mathcal{T}} A_1$. However,
$\sqcap$-minimality is a syntactic property, which requires removal of redundant conjuncts.
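The example can be replayed with a small propositional sketch. It assumes concepts are sets of names and the TBox is a map from a name to the names it implies; Step 0 is skipped because the example contains no disjointness axiom, and the greedy Step 5 visits conjuncts in alphabetical order, which is an arbitrary choice. The function names are hypothetical.

```python
def closure(names, tbox):
    """Normal form of a conjunction of names: add everything the TBox implies."""
    closed, frontier = set(names), list(names)
    while frontier:
        for implied in tbox.get(frontier.pop(), ()):
            if implied not in closed:
                closed.add(implied)
                frontier.append(implied)
    return frozenset(closed)


def find_irred_prop(c, d, tbox):
    """Return the hypothesis before Step 5 and the irreducible one after Step 5."""
    c_nf, d_nf = closure(c, tbox), closure(d, tbox)
    before = d_nf - c_nf                       # Step 1: names of D missing from C
    h = set(before)
    for name in sorted(before):                # Step 5: greedy removal of conjuncts
        rest = h - {name}
        if d_nf <= closure(c_nf | rest, tbox):
            h = rest                           # name was redundant given the TBox
    return frozenset(before), frozenset(h)


TBOX = {"A1": {"A2"}, "A3": {"A4"}}
BEFORE, AFTER = find_irred_prop({"A3"}, {"A1", "A4"}, TBOX)
```

`BEFORE` contains `A1` and `A2`, and `AFTER` contains only `A1`, matching the reduction described in the example.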

As for complexity, we aim at proving that finding an irreducible solution is not more
complex than subsumption in $\mathcal{ALN}$. A polynomial algorithm (w.r.t. the sizes of $C$, $D$
and $\mathcal{T}$) cannot be expected anyway, since subsumption in $\mathcal{AL}$ (the sublanguage of $\mathcal{ALN}$
without Number Restrictions) with a simple $\mathcal{T}$ is coNP-hard (Nebel, 1990; Calvanese, 1996).
However, Nebel (1990) argues that the unfolding of the TBox is exponential in the depth of
the hierarchy $\mathcal{T}$; if the depth of $\mathcal{T}$ grows as $O(\log |\mathcal{T}|)$ as the size of $\mathcal{T}$ increases (a "bushy
but not deep" TBox) then its unfolding is polynomial, and so is the above algorithm.

More generally, suppose that T is fixed: this is not an unrealistic hypothesis for our 
marketplace application, since T represents the ontology of the domain, that we do not 
expect to vary while supplies and demands enter and exit the marketplace. In that case, we 
can analyze the complexity of findIrred considering only $C$ and $D$ for the size of the input
of the problem. 

Theorem 2 Let $\mathcal{P} = \langle \mathcal{L}, C, D, \mathcal{T} \rangle$ be a CAP, with $\mathcal{L} = \mathcal{ALN}$, and $\mathcal{T}$ a simple TBox. Then
finding an irreducible solution to $\mathcal{P}$ is a problem solvable in time polynomial in the size of
$C$ and $D$.

We note that the problem of the exponential-size unfolding might be mitigated by Lazy 
Unfolding (Horrocks & Tobies, 2000). Using this technique, concept names in the TBox are 
unfolded only when needed. 

5.2 Abduction-Based Ranking of Potential Matches 

We define a penalty function for potential matches based on the following intuition: the 
ranking of potential matches should depend on how many hypotheses have to be made on 
counteroffers in order to transform them into full matches. 

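Definition 9 Given a simple TBox $\mathcal{T}$ in $\mathcal{ALN}$, we define a penalty function for the
potential match of a counteroffer $C$ given an offer $D$, where both $C$ and $D$ are concepts in
$\mathcal{ALN}$, as follows:

$$p_{\sqsubseteq}(C, D, \mathcal{T}) = |\mathit{findIrred}(\langle \mathcal{ALN}, CNF(C, \mathcal{T}), CNF(D, \mathcal{T}), \emptyset \rangle)| \qquad (1)$$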

Note that, when computing $p_{\sqsubseteq}$, a concept $H$ is actually computed by findIrred as an
intermediate step. This makes it easy to devise an explanation facility, so that the actual
obtained ranking can be immediately enriched with its logical explanation, thus improving
users' trust and interaction with the matchmaking system.

We now prove that $p_{\sqsubseteq}$ is in accordance with properties highlighted in the previous Section.
Since the computation of Formula (1) starts by putting concepts $C$, $D$ in normal form, we
recall that the normal form of $C$ can be summarized as $C_{names} \sqcap C_{\sharp} \sqcap C_{all}$, and similarly for
$D$. Without ambiguity, we use the three components also as sets of the conjoined concepts.

Theorem 3 The penalty function $p_{\sqsubseteq}$ is (i) non-symmetric, (ii) syntax independent, and
(iii) monotonic over subsumption.

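Proof. (i) Non-symmetricity is easily proved by providing an example: $p_{\sqsubseteq}(A, \top, \emptyset) \neq
p_{\sqsubseteq}(\top, A, \emptyset)$. In fact, findIrred($\langle \mathcal{ALN}, A, \top, \emptyset \rangle$) finds $H_1 = \top$ as a solution ($A \sqsubseteq \top$ without
further hypothesis) while findIrred($\langle \mathcal{ALN}, \top, A, \emptyset \rangle$) finds $H_2 = A$. Recalling that $|\top| = 0$,
while $|A| = 1$, we get the first claim.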

(ii) Syntax independence follows from the fact that normal forms are used in Formula (1), 
and as already said normal forms are unique up to commutativity of conjunction. 
(iii) Monotonicity over subsumption is proved by analyzing the conditions for
subsumption in $\mathcal{ALN}$. A concept $C'$ is subsumed by a concept $C$ whenever all conditions below
hold. For each condition, we analyze the changes in the behavior of findIrred, proving that
the provided solution $H$ just adds other conjuncts. Recall that monotonicity over
subsumption is applied only to potential matches, hence we assume that both $C$ and $C'$ are
consistent with $D$. Since findIrred is recursive, the proof is also by induction on the
quantification nesting (QN) of $C'$. For $C'$ having QN equal to 0, $C'$ can only be a conjunction
of atomic concepts: names, negated names, number restrictions. Then the conditions for
subsumption are the following:

• The first condition is that $C_{names+} \subseteq C'_{names+}$. Hence, in Step 1.1 of findIrred, the
number of concept names that are added to $H'$, with respect to names added to $H$,
can only decrease, and so $|H'| \leq |H|$ considering names. Regarding negated names,
observe that they do not contribute to the solution of findIrred, since they come from
a disjointness axiom and a positive name (that contributes).

• The second condition is that for every number restriction in $C_{\sharp}$, either the same
number restriction appears in $C'_{\sharp}$, or it is strengthened (an at-least increases, an
at-most decreases) in $C'_{\sharp}$. Hence, number restrictions added by Steps 2.1 and 3.1 to $H'$
can be either as many as those added to $H$, or less. Again, also considering number
restrictions $|H'| \leq |H|$.

The above two cases prove the basis of the induction ($C'$ with QN equal to 0). Suppose now
the claim holds for concepts $C'$ with QN $n$ or less, and let $C'$ have a QN of $n + 1$. Clearly,
in this case $C'$ has at least one universal role quantification, call it $\forall R.F'$. The condition
for subsumption between $C'$ and $C$ is the following:

• Either for every universal role quantification $\forall R.F$ in $C$ over the same role $R$, it must
hold $F' \sqsubseteq_{\mathcal{T}} F$, or there is no universal role quantification on $R$ in $C$. In the former
case, observe that findIrred is recursively called$^6$ in Step 4.1.1 with arguments $F$, $E$,
and $F'$, $E$; we call $I$ and $I'$, respectively, the solutions returned by findIrred. Observe
that the QN of $F'$ is $n$ or less, hence by inductive hypothesis $|I'| \leq |I|$. Since Step 4.1.1
adds $\forall R.I'$ and $\forall R.I$ to $H'$ and $H$, again $|H'| \leq |H|$. If instead there is no universal
role quantification on $R$ in $C$, Step 4.1.2 adds $\forall R.E$ to $H$. If also $C'$ does not contain
any role quantification on $R$, then Step 4.1.2 adds $\forall R.E$ also to $H'$, then $H'$ cannot
be longer than $H$ in this case. If a role quantification $\forall R.F'$ is in $C'$, then Step 4.1.1
makes a recursive call with arguments $F'$, $E$. In this case, the solution returned $I'$
has length less than or equal to $|E|$, hence the length of $H'$ cannot be longer than the
length of $H$ also in this case.

In summary, if $C' \sqsubseteq_{\mathcal{T}} C$ then in no case the length of $H'$ increases with respect to the
length of $H$. This proves the monotonicity over subsumption of $p_{\sqsubseteq}$. □

Intuitively, we could say that monotonicity over subsumption for potential matches means
"the more specific $C$ is, the lower its penalty, the better its ranking w.r.t. $D$". More
precisely, but less intuitively, we should say that "the rank of $C$ w.r.t. $D$ cannot worsen
when $C$ is made more specific". Hence, given an offer $D$, a TBox $\mathcal{T}$, a sequence of
increasingly specific counteroffers $C_1 \sqsupseteq_{\mathcal{T}} C_2 \sqsupseteq_{\mathcal{T}} C_3 \sqsupseteq_{\mathcal{T}} \cdots$ are assigned to a sequence of
non-increasing penalties $p_{\sqsubseteq}(C_1, D, \mathcal{T}) \geq p_{\sqsubseteq}(C_2, D, \mathcal{T}) \geq p_{\sqsubseteq}(C_3, D, \mathcal{T}) \geq \ldots$ We now prove
that such sequences are well-founded, with bottom element zero, reached in case of
subsumption.

6. findIrred is called only once, because concepts in CNF have at most one universal role quantification
over any role $R$.

Proposition 4 $p_{\sqsubseteq}(C, D, \mathcal{T}) = 0$ if and only if $C \sqsubseteq_{\mathcal{T}} D$.

Proof. Recall from Section 3.1 that $\top$ and $\bot$ are the only concepts of length zero, and
findIrred returns $\bot$ if and only if $C$ and $D$ are not in a potential match (Step 0 in findIrred).
Hence, $p_{\sqsubseteq}(C, D, \mathcal{T}) = 0$ if and only if the concept whose length is computed in Formula (1)
is $\top$. By construction of findIrred, $\top$ is returned by the call
findIrred($\langle \mathcal{ALN}, CNF(C, \mathcal{T}), CNF(D, \mathcal{T}), \emptyset \rangle$) if and only if $CNF(C, \mathcal{T}) \sqsubseteq CNF(D, \mathcal{T})$, which
holds (see Borgida & Patel-Schneider, 1994) if and only if $C \sqsubseteq_{\mathcal{T}} D$. □

Moreover, we could also prove that adding to $C$ details that are irrelevant for $D$ leaves the
penalty unaffected, while adding to $C$ details that are relevant for $D$ lowers $C$'s penalty.

Note also that in Formula (1) we take $\mathcal{T}$ into account in the normal form of $C$, $D$, but
then we forget it (we use an empty TBox) when calling findIrred. We explain such a choice
with the aid of an example.

Example 2 Given $\mathcal{T} = \{A \sqsubseteq A_1 \sqcap A_2\}$, let $D = A$ be a Demand with the two following
supplies: $C_1 = A_2$, $C_2 = \top$. Observe that $CNF(D, \mathcal{T}) = A \sqcap A_1 \sqcap A_2$, $CNF(C_1, \mathcal{T}) =
A_2$, $CNF(C_2, \mathcal{T}) = \top$. If we used the following formula to compute the penalty

$$p'(C, D, \mathcal{T}) = |\mathit{findIrred}(\langle \mathcal{ALN}, C, D, \mathcal{T} \rangle)| \qquad (2)$$

and ran the algorithm findIrred($\langle \mathcal{ALN}, C_1, D, \mathcal{T} \rangle$) and findIrred($\langle \mathcal{ALN}, C_2, D, \mathcal{T} \rangle$), before
Step 5 we would get, respectively,

$H_1 = A_1 \sqcap A$
$H_2 = A_1 \sqcap A_2 \sqcap A$

and after Step 5 findIrred would return $H'_1 = H'_2 = A$, hence $C_1$ and $C_2$ would receive
the same penalty. However, we argue that $C_1$ is closer to $D$ than $C_2$ is, because it
contains a characteristic ($A_2$) implicitly required by $D$, while $C_2$ does not. If instead we call
findIrred($\langle \mathcal{ALN}, CNF(C_1, \mathcal{T}), CNF(D, \mathcal{T}), \emptyset \rangle$) and
findIrred($\langle \mathcal{ALN}, CNF(C_2, \mathcal{T}), CNF(D, \mathcal{T}), \emptyset \rangle$), we get the solutions $H_1$ and $H_2$ above, and
Step 5 does not delete any conjunct, since $\mathcal{T} = \emptyset$. Therefore, $C_1$ gets penalty 2, while $C_2$
gets penalty 3, highlighting what is more specified in $C_1$ w.r.t. $C_2$.
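The two alternatives compared in this example can be reproduced with a propositional sketch. It assumes concepts are sets of names and the TBox maps a name to the names it implies; the flag `reduce_with_tbox` is a hypothetical switch that selects whether the reducibility step sees the TBox (the variant of Formula (2)) or an empty one (Formula (1)).

```python
def closure(names, tbox):
    """Normal form of a conjunction of names: add everything the TBox implies."""
    closed, frontier = set(names), list(names)
    while frontier:
        for implied in tbox.get(frontier.pop(), ()):
            if implied not in closed:
                closed.add(implied)
                frontier.append(implied)
    return frozenset(closed)


def penalty(c, d, tbox, reduce_with_tbox=False):
    """Length of the hypothesis needed to turn counteroffer c into a full match for d."""
    c_nf, d_nf = closure(c, tbox), closure(d, tbox)   # the TBox is always used here
    h = set(d_nf - c_nf)
    step5_tbox = tbox if reduce_with_tbox else {}     # Formula (1) forgets the TBox
    for name in sorted(h):
        rest = h - {name}
        if d_nf <= closure(c_nf | rest, step5_tbox):
            h = rest
    return len(h)


TBOX = {"A": {"A1", "A2"}}
DEMAND = {"A"}
C1, C2 = {"A2"}, set()
```

With the empty TBox in the reduction step the two supplies are told apart (penalties 2 and 3), whereas with `reduce_with_tbox=True` both collapse to penalty 1.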

More generally, we can say that the reducibility step (Step 5 in findIrred) flattens a solution
to its most specific conjuncts, leaving to the TBox the implicit representation of other
characteristics, both the ones already present in the supply and those not present. Therefore,
making an empirical decision, we consider the TBox in the normal form of $C$ and $D$, but
we exclude it from further reductions in Step 5 of findIrred.
Remark 2 Although the definition of Concept Abduction could appear similar to Concept
Difference, it is not so. We note that generically speaking, the name "Concept Abduction"
appeals to logic, while "Concept Difference" appeals to algebra (although Difference has
multiple solutions when $\mathcal{L}$ includes universal role quantification). More precisely, we recall
(Teege, 1994) that difference is defined as: $C - D = \max_{\sqsubseteq} \{E \in \mathcal{L} : (E \sqcap D) \equiv C\}$ provided
that $C \sqsubseteq D$. A more specialized definition of difference (Brandt, Küsters, & Turhan, 2002)
refers only to DLs $\mathcal{ALC}$ and $\mathcal{ALE}$. It is defined as: $C - D = \min_{\preceq} \{E \in \mathcal{L} : (E \sqcap D) \equiv
(C \sqcap D)\}$, where $C, E \in \mathcal{ALC}$, $D \in \mathcal{ALE}$, and minimality is w.r.t. a preorder $\preceq$ on a specific
normal form which extends CNF to $\mathcal{ALC}$. No TBox is taken into account.

Instead, the solution of a CAP $\langle \mathcal{L}, C, D, \mathcal{T} \rangle$ does not require that $C \sqsubseteq_{\mathcal{T}} D$, but only that
$C \sqcap D \not\sqsubseteq_{\mathcal{T}} \bot$. In general, when $D \sqsubseteq_{\mathcal{T}} C$ if we let $H = D - C$ in a CAP $\mathcal{P} = \langle \mathcal{L}, C, D, \mathcal{T} \rangle$
we get those solutions for which $C \sqcap H \equiv D$, which obviously are not all solutions to $\mathcal{P}$.
Hence $D - C \subseteq SOL(\mathcal{P})$, but not vice versa (see the proof of Proposition 2 for an example).
When $D \not\sqsubseteq_{\mathcal{T}} C$ this comparison is not even possible, since $D - C$ is undefined. However, in
a generic setting, e.g., in an e-commerce scenario, subsumption between demand and supply
is quite uncommon; most offers are such that neither subsumes the other. Because of this
greater generality, for our specific application to matchmaking, Concept Abduction seems
more suited than Concept Difference to make a basis for a penalty function.

6. Concept Contraction 

If $D \sqcap C$ is unsatisfiable in $\mathcal{T}$, but the demander agrees to retract some of $D$'s constraints,
partially matching supplies may be reconsidered. However, other logic-based approaches
to matchmaking by Trastour et al. (2002), Sycara et al. (2002), Li and Horrocks (2003)
usually exclude the case in which the concept expressing a demand is inconsistent with the
concept expressing a supply, assuming that all requirements are strict ones. In contrast,
we believe that inconsistent matches can still be useful, especially in e-marketplaces. In
fact, partial (a.k.a. disjoint) matches can be the basis for a negotiation process, allowing
a user to specify negotiable requirements, some of which could be bargained in favor of
others. Such a negotiation process can be carried out in various ways adopting approaches
to matchmaking not based on logic (e.g., Strobel & Stolze, 2002), but also, as shown in
practice by Colucci et al. (2005), using Belief Revision. In fact, the logical formalization
of conflicting matches, aimed at finding still "interesting" inconsistent matches without
having to revert to text-based or hybrid approaches, can be obtained exploiting definitions
typical of Belief Revision. In accordance with the formalization of Gärdenfors (1988), revision of
a knowledge base $\mathcal{K}$ with a new piece of knowledge $A$ is a contraction operation, which
results in a new knowledge base $\mathcal{K}^{-}_{A}$ such that $\mathcal{K}^{-}_{A} \not\models \neg A$, followed by the addition of $A$
to $\mathcal{K}^{-}_{A}$, usually modeled by conjunction. We call Concept Contraction our adaptation of
Belief Revision to DLs.

Starting with $C \sqcap D$ unsatisfiable in a TBox $\mathcal{T}$, we model with Concept Contraction
how, retracting requirements in $C$, we may still obtain a concept $K$ (for Keep) such that
$K \sqcap D$ is satisfiable in $\mathcal{T}$. Clearly, a user is interested in what he/she must negotiate on to
start the transaction: a concept $G$ (for Give up) such that $C \equiv G \sqcap K$.

For instance, with reference to the ontology in Figure 1, if a user demands $Dem$ and a
supplier offers $Sup$, where $Dem$ and $Sup$ are described as follows:

$Dem = HomePC \sqcap \forall hasMonitor.LCDmonitor$
$Sup = HomePC \sqcap \forall hasMonitor.CRTmonitor$

it is possible to check that $Sup \sqcap Dem$ is unsatisfiable. This is a partial match. Yet, in this
case, if the demander gives up the concept $G = \forall hasMonitor.LCDmonitor$ and keeps the
concept $K = HomePC$, $K \sqcap Sup$ is satisfiable, hence $K$ now potentially matches $Sup$.

More formally, we model a Concept Contraction problem as follows.

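Definition 10 (Concept Contraction) Let $\mathcal{L}$ be a DL, $C$, $D$ be two concepts in $\mathcal{L}$, and
$\mathcal{T}$ be a set of axioms in $\mathcal{L}$, where both $C$ and $D$ are satisfiable in $\mathcal{T}$. A Concept Contraction
Problem (CCP), denoted as $\langle \mathcal{L}, C, D, \mathcal{T} \rangle$, is finding a pair of concepts $\langle G, K \rangle \in \mathcal{L} \times \mathcal{L}$ such
that $\mathcal{T} \models C \equiv G \sqcap K$, and $K \sqcap D$ is satisfiable in $\mathcal{T}$. We call $K$ a contraction of $C$ according
to $D$ and $\mathcal{T}$.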

We use $Q$ as a symbol for a CCP, and we denote with $SOLCCP(Q)$ the set of all
solutions to a CCP $Q$. Observe that, as for Concept Abduction, we rule out cases where
either $C$ or $D$ is unsatisfiable, as they correspond to counterintuitive situations. We note
that there is always the trivial solution $\langle G, K \rangle = \langle C, \top \rangle$ to a CCP. This solution corresponds
to the most drastic contraction, which gives up everything of $C$. On the other hand, when
$C \sqcap D$ is satisfiable in $\mathcal{T}$, the "best" possible solution is $\langle \top, C \rangle$, that is, give up nothing.
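
The two boundary solutions just mentioned can be checked mechanically on a toy fragment. The sketch below is only an illustration under strong assumptions: concepts are plain conjunctions of atomic and negated atomic concepts, the TBox is empty, and all names (`satisfiable`, `is_ccp_solution`, `Laptop`, `LCD`) are hypothetical and not part of the framework described here.

```python
# Toy fragment (an assumption of this sketch): a concept is a frozenset of
# literals such as "LCD" or "~LCD"; conjunction is set union; the TBox is empty.
TOP = frozenset()

def satisfiable(concept):
    """A conjunction of literals is satisfiable iff it contains no pair A, ~A."""
    return not any("~" + lit in concept for lit in concept if not lit.startswith("~"))

def is_ccp_solution(c, d, g, k):
    """Definition 10 in the toy fragment: C equals G and K, and K and D is satisfiable."""
    return (g | k) == c and satisfiable(k | d)

c = frozenset({"Laptop", "LCD"})    # the concept to be contracted
d = frozenset({"Laptop", "~LCD"})   # the concept it conflicts with

print(is_ccp_solution(c, d, c, TOP))   # trivial solution: give up everything
print(is_ccp_solution(c, d, TOP, c))   # give up nothing: rejected, since C and D clash
print(is_ccp_solution(c, d, frozenset({"LCD"}), frozenset({"Laptop"})))
```

In this fragment, equivalence of satisfiable conjunctions reduces to set equality, which is why `(g | k) == c` suffices; with a TBox or richer constructors a real subsumption check would be needed.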

As Concept Abduction extends Subsumption, Concept Contraction extends Satisfiability,
in particular, satisfiability of a conjunction $C \sqcap D$. Hence, results about the complexity of
deciding Satisfiability of a given concept carry over to Contraction.

Proposition 5 Let $\mathcal{L}$ be a DL containing $\mathcal{AL}$, and let Concept Satisfiability w.r.t. a TBox
in $\mathcal{L}$ be a problem $\mathcal{C}$-hard for a complexity class $\mathcal{C}$. Then deciding whether a given pair of
concepts $\langle G, K \rangle$ is a solution of a CCP $Q = \langle \mathcal{L}, C, D, \mathcal{T} \rangle$ is $\mathcal{C}$-hard.

Proof. A concept $E \in \mathcal{L}$ is satisfiable w.r.t. a TBox $\mathcal{T}$ if and only if the CCP $\langle \mathcal{L}, C, D, \mathcal{T} \rangle$
has the solution $\langle \top, C \rangle$, where $C = \forall R.E$ and $D = \exists R.\top$. Indeed, $\langle \top, C \rangle$ is a solution
exactly when $\forall R.E \sqcap \exists R.\top$ is satisfiable, which requires an $R$-filler belonging to $E$. Then, $\mathcal{L}$ should contain at least
universal role quantification (to express $\forall R.E$), unqualified existential role quantification
(to express $\exists R.\top$), conjunction (to express that $C \equiv G \sqcap K$) and at least the unsatisfiable
concept $\bot$ (otherwise every concept is satisfiable, and the problem trivializes). The minimal
known DL containing all such constructs is the DL $\mathcal{AL}$. □

This gives a lower bound on the complexity of Concept Contraction, for all DLs that
include $\mathcal{AL}$. For DLs not including $\mathcal{AL}$, note that if the proof showing $\mathcal{C}$-hardness of
satisfiability involves a concept with a topmost $\sqcap$ symbol, the same proof could be adapted
for Concept Contraction.

Obviously, a user in a marketplace is likely to be willing to give up as few things as
possible, so some minimality in the contraction $G$ must be defined. We skip for conciseness
the definitions of a minimal-length contraction and subsumption-maximal contraction, and
define straightforwardly conjunction-minimal contraction for DLs that admit a normal form
made up of conjunctions.

Definition 11 Let $Q = \langle \mathcal{L}, C, D, \mathcal{T} \rangle$ be a CCP in which $\mathcal{L}$ admits a CNF. The set $SOLCCP_{\sqcap}(Q)$
is the subset of $SOLCCP(Q)$ with the following property: if $\langle G, K \rangle \in SOLCCP_{\sqcap}(Q)$ then
for no sub-conjunction $G'$ of $G$ it holds $\langle G', K \rangle \in SOLCCP(Q)$. We call such solutions
irreducible.

6.1 Number-Restriction Minimal Contractions 

In what follows we focus on a specific class of irreducible solutions for a CCP $\langle \mathcal{ALN}, C, D, \mathcal{T} \rangle$
exposing interesting characteristics from a user-oriented point of view in a matchmaking
scenario. Before defining such a class we explain the rationale behind its investigation using
the following example.

Example 3 Suppose we have the following situation:

demand $Dem = HomePC \sqcap \forall hasMonitor.LCDmonitor$
supply $Sup = Server \sqcap \forall hasMonitor.CRTmonitor$

As $\mathcal{T} \models Dem \sqcap Sup \equiv \bot$, the demander can contract $Dem$ in order to regain satisfiability
with $Sup$. Two solutions for the CCP $Q = \langle \mathcal{ALN}, Dem, Sup, \mathcal{T} \rangle$ are:

$\langle G_{\geq}, K_{\geq} \rangle$ with
    $G_{\geq} = HomePC$
    $K_{\geq} = PC \sqcap (\geq 1\, hasSoftware) \sqcap (= 1\, hasOS) \sqcap \forall hasMonitor.LCDmonitor$

$\langle G_{\forall}, K_{\forall} \rangle$ with
    $G_{\forall} = \forall hasMonitor.LCDmonitor$
    $K_{\forall} = HomePC$

In $\langle G_{\geq}, K_{\geq} \rangle$ the demander should give up the specification on $HomePC$; in $\langle G_{\forall}, K_{\forall} \rangle$ the
demander should give up only some specifications on the monitor type while keeping the
rest.

Observe that both solutions are in the previously defined class $SOLCCP_{\sqcap}(Q)$, but from
a user-oriented point of view, $\langle G_{\forall}, K_{\forall} \rangle$ seems the most reasonable solution to $Q$. Giving
up the $HomePC$ concept in $Dem$ (and then $(\geq 1\, hasMonitor)$ because of the axiom on
$HomePC$), the demander keeps all the specifications on requested components, but they are
vacuously true, since $K_{\geq} \sqcap Sup$ implies $\forall hasMonitor.\bot$, i.e., no component is admitted.

In order to make our intuition more precise, we introduce the number-restriction-minimal
solutions for $Q$, whose set we denote $SOLCCP_{\mathcal{N}}(Q)$. Intuitively, a solution $\langle G, K \rangle$ for $Q$ is
in $SOLCCP_{\mathcal{N}}(Q)$ when an at-least restriction $(\geq n\, R)$ is in $G$ only if it directly conflicts
with an at-most restriction $(\leq m\, R)$ (with $m < n$) in $D$. Solutions in which the at-least
restriction is given up because of conflicting universal role quantifications (e.g., $\forall R.A$
and $\forall R.\neg A$) are not in $SOLCCP_{\mathcal{N}}(Q)$. Since this characteristic of number-restriction-minimal
solutions should be enforced at any level of nesting, we first introduce the role
path of a concept in $\mathcal{ALN}$. Here we need to distinguish between a concept $A$ and its
(different) occurrences in another concept, e.g., $B = A \sqcap \forall R.A$. In theory, we should mark
each occurrence with a number, e.g., $A^1 \sqcap \forall R.A^2$; however, since we need to focus on one
occurrence at a time, we just mark it as $\underline{A}$.

Definition 12 Given a concept $B$ in $\mathcal{ALN}$, and an occurrence $\underline{A}$ of an atomic (sub)concept
$A$ in $B$, a role path for $\underline{A}$ in $B$, $\Pi_{\underline{A}}(B)$, is a string such that:

- $\Pi_{\underline{A}}(\underline{A}) = \epsilon$, where $\epsilon$ denotes the empty string

- $\Pi_{\underline{A}}(B_1 \sqcap B_2) = \Pi_{\underline{A}}(B_i)$, where $B_i$, $i \in \{1, 2\}$, is the concept in which the occurrence
of $\underline{A}$ appears

- $\Pi_{\underline{A}}(\forall R.B) = R \circ \Pi_{\underline{A}}(B)$, where $\circ$ denotes string concatenation
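
The three rules of Definition 12 translate directly into a recursive function. The following is a minimal sketch under assumptions of our own: concepts are encoded as nested tuples `("and", B1, B2, ...)` and `("all", R, B)` with strings as atomic leaves, and the hypothetical function `role_paths` returns one path per occurrence instead of tracking a single marked occurrence.

```python
def role_paths(concept, atom):
    """Return one role path (a tuple of role names) per occurrence of `atom`."""
    if isinstance(concept, str):          # atomic leaf: the path is the empty string
        return [()] if concept == atom else []
    tag = concept[0]
    if tag == "and":                      # descend into the conjunct(s) holding an occurrence
        return [p for part in concept[1:] for p in role_paths(part, atom)]
    if tag == "all":                      # prefix the universally quantified role
        _, role, filler = concept
        return [(role,) + p for p in role_paths(filler, atom)]
    raise ValueError(f"unknown constructor: {tag!r}")

# B = A and (forall R.A): the two occurrences of A have different role paths.
b = ("and", "A", ("all", "R", "A"))
print(role_paths(b, "A"))
```

A number restriction can be treated as an opaque leaf in this encoding (for instance the string `">=2 R"`), which is how role paths are applied to at-least and at-most restrictions in what follows.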

The role path $\Pi_{\underline{A}}(B)$ represents the role nesting of a concept $A$ occurrence into a concept
$B$. Note that $\Pi_{\underline{A}}(B)$ is the same for any commutation of conjunctions in $B$, and for any
rearrangement of universal role quantifications; if $A$ was not atomic, this would not be
true (see footnote 7). Using the previous definition we can now define $SOLCCP_{\mathcal{N}}(Q)$.

Definition 13 Let $Q = \langle \mathcal{ALN}, C, D, \mathcal{T} \rangle$ be a CCP. The set $SOLCCP_{\mathcal{N}}(Q)$ is the subset
of solutions $\langle G, K \rangle$ in $SOLCCP_{\sqcap}(Q)$ such that if $(\geq n\, R)$ occurs in $G$ then there exists
$(\leq m\, R)$, with $m < n$, occurring in $CNF(D, \mathcal{T})$ and $\Pi_{(\geq n\, R)}(G) = \Pi_{(\leq m\, R)}(CNF(D, \mathcal{T}))$.

We now illustrate an algorithm findContract that returns a solution $\langle G, K \rangle \in SOLCCP_{\mathcal{N}}(Q)$
for $Q = \langle \mathcal{ALN}, CNF(C, \mathcal{T}), CNF(D, \mathcal{T}), \emptyset \rangle$, that is, it compares two $\mathcal{ALN}$-concepts $C$ and
$D$, both already in CNF w.r.t. a TBox $\mathcal{T}$, and computes a number-restriction minimal
contraction $\langle G, K \rangle$ of $C$ w.r.t. $D$ without considering the TBox.

Algorithm findContract(C, D);
input $\mathcal{ALN}$ concepts $C$, $D$, both already in CNF
output number-restriction minimal contraction $\langle G, K \rangle$,
    where $\langle G, K \rangle = \langle \top, C \rangle$ means that $C \sqcap D$ is satisfiable
variables concepts $G$, $K$, $G'$, $K'$
begin
    1. if $C = \bot$
           then return $\langle \bot, \top \rangle$; /* see comment 1 */
    2. $G := \top$; $K := \top \sqcap C$; /* see comment 2 */
    3. for each concept name $A \in K_{names+}$
           if there exists a concept $\neg A \in D_{names\neg}$
           then $G := G \sqcap A$; delete $A$ from $K$;
    4. for each concept $(\geq x\, R) \in K_{\sharp}$
           such that there is a concept $(\leq y\, R) \in D_{\sharp}$ with $y < x$
           $G := G \sqcap (\geq x\, R)$; delete $(\geq x\, R)$ from $K$;
    5. for each concept $(\leq x\, R) \in K_{\sharp}$
           such that there is a concept $(\geq y\, R) \in D_{\sharp}$ with $y > x$
           $G := G \sqcap (\leq x\, R)$; delete $(\leq x\, R)$ from $K$;
    6. for each concept $\forall R.F \in K_{all}$
           if there exist $\forall R.E \in D_{all}$ and (
               either $(\geq x\, R) \in K_{\sharp}$ with $x \geq 1$
               or $(\geq x\, R) \in D_{\sharp}$ with $x \geq 1$ )
           then let $\langle G', K' \rangle$ be the result of findContract(F, E) in
               $G := G \sqcap \forall R.G'$;
               replace $\forall R.F$ in $K$ with $\forall R.K'$;
    7. return $\langle G, K \rangle$;
end.

Footnote 7. For readers that are familiar with the concept-centered normal form of concepts (Baader et al., 2003),
we note that $\Pi_{\underline{A}}(B)$ is a word for $U_A$ in the concept-centered normal form of $B$.

Let us comment on the algorithm:

1. the case in Step 1 cannot occur at the top level, since we assumed $C$ and $D$ to be satisfiable
in the definition of CCP. However, $\bot$ may occur inside a universal quantification
(e.g., $C = \forall R.\bot$), hence the case of Step 1 may apply in a recursive call of findContract,
issued from Step 6 of an outer call.

2. in Step 2, the conjunction $\top \sqcap C$ is assigned to $K$ in order to leave $\top$ in $K$ if every
other concept is removed by the subsequent steps.
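
To make the control flow of findContract concrete, here is a small Python sketch. It is not the implementation discussed later in the paper: the `Concept` encoding, the `"~A"` spelling of negated names and the function names are all assumptions of this sketch, inputs are taken to be already in CNF, and literal clashes are tested in both directions (a name against its negation and vice versa), which goes slightly beyond Step 3 as printed.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """ALN concept in CNF; this encoding is an assumption of the sketch."""
    names: set = field(default_factory=set)       # literals "A" or "~A"
    at_least: dict = field(default_factory=dict)  # role -> x, for (>= x R)
    at_most: dict = field(default_factory=dict)   # role -> x, for (<= x R)
    forall: dict = field(default_factory=dict)    # role -> Concept, for (forall R.F)
    bottom: bool = False                          # True only for the concept bottom

def negate(lit):
    """Map "A" to "~A" and "~A" to "A"."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def find_contract(c, d):
    """Return (G, K) following Steps 1-7; the inputs are not modified."""
    if c.bottom:                                   # Step 1
        return Concept(bottom=True), Concept()
    g = Concept()                                  # Step 2: G := top, K := copy of C
    k = Concept(set(c.names), dict(c.at_least), dict(c.at_most), dict(c.forall))
    for lit in c.names:                            # Step 3: clashing literals
        if negate(lit) in d.names:
            g.names.add(lit)
            k.names.discard(lit)
    for role, x in c.at_least.items():             # Step 4: (>= x R) against (<= y R), y < x
        if role in d.at_most and d.at_most[role] < x:
            g.at_least[role] = x
            del k.at_least[role]
    for role, x in c.at_most.items():              # Step 5: (<= x R) against (>= y R), y > x
        if role in d.at_least and d.at_least[role] > x:
            g.at_most[role] = x
            del k.at_most[role]
    for role, f in c.forall.items():               # Step 6: recurse only if fillers are forced
        forced = k.at_least.get(role, 0) >= 1 or d.at_least.get(role, 0) >= 1
        if role in d.forall and forced:
            g_sub, k_sub = find_contract(f, d.forall[role])
            g.forall[role] = g_sub
            k.forall[role] = k_sub
    return g, k                                    # Step 7

# C = (>= 2 R) and forall R.A;  D = (<= 1 R) and forall R.~A
c = Concept(at_least={"R": 2}, forall={"R": Concept(names={"A"})})
d = Concept(at_most={"R": 1}, forall={"R": Concept(names={"~A"})})
g, k = find_contract(c, d)
print(g)
print(k)
```

In this example only the at-least restriction is given up: once it is removed from K, no filler for R is forced, so the two universal quantifications no longer conflict and Step 6 does not recurse.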

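We denote by $\langle G_{\emptyset}, K_{\emptyset} \rangle$ solutions for the CCP $Q_{\emptyset} = \langle \mathcal{ALN}, CNF(C, \mathcal{T}), CNF(D, \mathcal{T}), \emptyset \rangle$. In
this simplified CCP $Q_{\emptyset}$, we completely unfold $\mathcal{T}$ in both $C$ and $D$ and then forget it.

Theorem 4 The pair $\langle G, K \rangle$ computed by findContract(C, D) is a number-restriction-minimal
contraction for $Q_{\emptyset} = \langle \mathcal{ALN}, CNF(C, \mathcal{T}), CNF(D, \mathcal{T}), \emptyset \rangle$.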

Proof. We first prove that $\langle G, K \rangle$ is a solution for $Q_{\emptyset}$, namely, that (i) $G \sqcap K \equiv C$,
and that (ii) $K \sqcap D$ is satisfiable. We prove (i) by induction. For the base cases, observe
that the claim is true in Step 2 by construction, and that in Steps 3-5 when a conjunct
is deleted from $K$, it is also added to $G$. Hence the claim holds when no recursive call is
made. For the inductive case, assume the claim holds for each recursive call in Step 6, that
is, $G' \sqcap K' \equiv F$ for every concept $\forall R.F \in K_{all}$. Let $G_n$, $K_n$ be the values of variables $G$, $K$
before the execution of Step 6, and let $K_n^{-}$ be the concept $K_n$ without $\forall R.F$. Then, after
Step 6 it is:

\begin{align*}
G \sqcap K &\equiv G_n \sqcap \forall R.G' \sqcap K_n^{-} \sqcap \forall R.K' && \text{(by assignment)} \\
&\equiv G_n \sqcap K_n^{-} \sqcap \forall R.(G' \sqcap K') && \text{(by definition of } \forall \text{)} \\
&\equiv G_n \sqcap K_n^{-} \sqcap \forall R.F && \text{(by inductive hypothesis)} \\
&\equiv G_n \sqcap K_n && \text{(by definition of } K_n^{-} \text{)} \\
&\equiv C && \text{(since the base case holds before Step 6)}
\end{align*}

Regarding (ii), the proof is again by induction, where the inductive hypothesis is that
$K' \sqcap E$ is satisfiable. Basically, we construct an interpretation $(\Delta, \cdot^{\mathcal{I}})$ with an element $x$
such that $x \in (K \sqcap D)^{\mathcal{I}}$, and show that we can keep constructing $\mathcal{I}$ without contradictions,
since contradicting concepts have been deleted from $K$. In the inductive case, we assume
the existence of an interpretation $(\Delta', \cdot^{\mathcal{J}})$ for $K' \sqcap E$ such that $y \in \Delta' \cap (K' \sqcap E)^{\mathcal{J}}$, and then
build a joint interpretation $(\Delta'', \cdot^{\mathcal{I}''})$ by letting $\Delta'' = \Delta \uplus \Delta'$, $\mathcal{I}'' = \mathcal{I} \cup \mathcal{J} \cup \{\langle x, y \rangle \in R^{\mathcal{I}''}\}$.

We now prove that $\langle G, K \rangle$ is a number-restriction-minimal solution for $Q_{\emptyset}$. The proof
is by induction on the Quantification Nesting (QN) of $C$, defined in Section 3.1. Observe
that an at-least restriction is deleted from $K$ only in Step 4 of findContract. For the base
case ($QN(C) = 0$, no recursive call) observe that the role path of a retracted concept
$(\geq n\, R)$ in $G$ is $\epsilon$, same as the role path of the concept $(\leq m\, R)$ in $D$ causing Step 4 to
be executed. Hence, the claim holds in the base case. For the inductive case, assume that
the claim holds for all concepts with QNs smaller than $QN(C)$. Observe that the concept
$F$ in Step 6 is such a concept, since its QN is smaller by at least 1. Hence, if an (occurrence
of an) at-least restriction $(\geq x\, R)$, with role path $\Pi_{(\geq x\, R)}(F)$, is deleted in $F$, there exists
a conflicting at-most restriction in $E$ with the same role path. Since both $F$ and $E$ occur
inside the scope of a concept $\forall R.F$, $\forall R.E$ respectively, the claim still holds with role path
$\Pi_{(\geq x\, R)}(C) = R \circ \Pi_{(\geq x\, R)}(F)$. □



6.2 Contraction-Based Ranking of Partial Matches 

We now define a penalty function $p_{\bot}$ for partial matches based on the following intuition:
the partial matches should be ranked based on how many characteristics should be retracted
from each $C$ to make them potential matches.

Algorithm penaltyPartial(C, D);
input $\mathcal{ALN}$ concepts $C$, $D$, both already in CNF
output a penalty for the partial match between $C$ and $D$,
    where zero means that $C \sqcap D$ is satisfiable
variables integer $n$
begin
    1. if $C = \bot$
           then return $|D|$; /* see Comment 1 */
    2. $n := 0$;
    3. for each concept name $A \in C_{names+}$
           if there exists a concept $\neg A \in D_{names\neg}$
           then $n := n + 1$;
    4. for each concept $(\geq x\, R) \in C_{\sharp}$
           such that there is a concept $(\leq y\, R) \in D_{\sharp}$ with $y < x$
           $n := n + 1$;
    5. for each concept $(\leq x\, R) \in C_{\sharp}$
           such that there is a concept $(\geq y\, R) \in D_{\sharp}$ with $y > x$
           $n := n + 1$;
    6. for each concept $\forall R.F \in C_{all}$
           if there exist $\forall R.E \in D_{all}$ and (
               either ($(\geq x\, R) \in C_{\sharp}$ and $(\leq y\, R) \notin D_{\sharp}$ with $x > y$) /* see Comment 2 */
               or $(\geq x\, R) \in D_{\sharp}$ with $x \geq 1$ )
           then $n := n +$ penaltyPartial(F, E);
    7. return $n$;
end.

The above algorithm has a structure very similar to findContract: whenever findContract
removes concepts from $K$, penaltyPartial adds penalties to $n$. The two differences are
explained in the following comments:

1. Step 1 adds the whole length of $D$ when $C = \bot$. This addition ensures antimonotonicity
in the presence of $\bot$, as explained in Example 4 below.

2. Step 6 has in penaltyPartial the additional condition "and $(\leq y\, R) \notin D_{\sharp}$ with $x > y$".
This condition is necessary because penaltyPartial does not actually remove concepts,
but just counts them. If an at-least restriction in $C_{\sharp}$ is in contrast with an at-most
restriction in $D_{\sharp}$, then findContract removes it from $K$, while penaltyPartial just adds
1 to $n$. Yet, when the condition in Step 6 is evaluated, findContract finds it false just
because the at-least restriction has been removed, while penaltyPartial would find it
true, were it not for the additional condition.

We now use the outcome of penaltyPartial to define a penalty function for partial matches.

Definition 14 Given a simple TBox $\mathcal{T}$ in $\mathcal{ALN}$, let the penalty function $p_{\bot}$ for the partial
match of a counteroffer $C$ given an offer $D$, where both $C$ and $D$ are concepts in $\mathcal{ALN}$, be
as follows.

$p_{\bot}(C, D, \mathcal{T}) = penaltyPartial(CNF(C, \mathcal{T}), CNF(D, \mathcal{T}))$    (3)

Note that since penaltyPartial closely follows findContract and findIrred, in fact Formula (3)
is more similar to Formula (1) in Definition 9 than it might appear. Implicitly, we solve
$Q_{\emptyset} = \langle \mathcal{ALN}, CNF(C, \mathcal{T}), CNF(D, \mathcal{T}), \emptyset \rangle$, and then use the result in the computation of the
penalty function, with a main difference in Step 1, though. We explain such a difference
with the help of an example.

Example 4 Let $Dem_1$ and $Dem_2$ be two demands, where $Dem_2 \sqsubseteq_{\mathcal{T}} Dem_1$, and let $Sup$ be
a supply, all modeled using the ontology $\mathcal{T}$ in Figure 1 as in the following:

$Dem_1 = PC \sqcap \forall hasMonitor.CRTmonitor$
$Dem_2 = PC \sqcap \forall hasMonitor.\bot$
$Sup = HomePC \sqcap \forall hasMonitor.LCDmonitor$

Computing findContract and penaltyPartial for both $CNF(Dem_1, \mathcal{T})$ and $CNF(Dem_2, \mathcal{T})$
w.r.t. $CNF(Sup, \mathcal{T})$ we obtain:

$findContract(CNF(Dem_1, \mathcal{T}), CNF(Sup, \mathcal{T})) = \langle \forall hasMonitor.CRTmonitor,\ PC \sqcap \forall hasMonitor.Monitor \rangle$
$penaltyPartial(CNF(Dem_1, \mathcal{T}), CNF(Sup, \mathcal{T})) = 1$

$findContract(CNF(Dem_2, \mathcal{T}), CNF(Sup, \mathcal{T})) = \langle \forall hasMonitor.\bot,\ PC \rangle$
$penaltyPartial(CNF(Dem_2, \mathcal{T}), CNF(Sup, \mathcal{T})) = 3$

In summary, the concept $\bot$ conflicts with every other concept, yet when a concept
$\forall R.\bot$ is given up, its length is zero (or any other constant), hence the length of $G$ cannot
be directly used as an antimonotonic penalty function. This explains the importance of
Step 1 in the above algorithm.

We can show the following formal correspondence between $p_{\bot}$ and the Concept Contraction
defined in the previous Section.

Theorem 5 Let $Q = \langle \mathcal{ALN}, C, D, \mathcal{T} \rangle$ be a CCP, and let $\langle G_{\emptyset}, K_{\emptyset} \rangle$ be the solution to $Q_{\emptyset}$
returned by findContract($CNF(C, \mathcal{T})$, $CNF(D, \mathcal{T})$). If $G_{\emptyset}$ does not contain any occurrence of
the concept $\bot$, then

$p_{\bot}(C, D, \mathcal{T}) = |G_{\emptyset}|$

Proof. The function $p_{\bot}$ is based on penaltyPartial, and by inspection, whenever penaltyPartial
increments $n$, findContract adds an atomic concept to $G_{\emptyset}$. The only exception is in Step 1
of penaltyPartial, which adds $|D|$ while findContract adds $\bot$ to $G_{\emptyset}$. However, this case is
explicitly outside the claim. □

We now prove that $p_{\bot}$ is in accordance with properties highlighted in the previous Section.

Theorem 6 The penalty function $p_{\bot}$ is (i) non-symmetric, (ii) syntax independent, and
(iii) antimonotonic over subsumption.

Proof. (i) Non-symmetry is proven by example: let $C = (\leq 1\, R) \sqcap \forall R.\neg A$, $D =
(\geq 2\, R) \sqcap \forall R.A$. For simplicity, $\mathcal{T} = \emptyset$, and observe that both $C$ and $D$ are already in
CNF. We now show that $p_{\bot}(C, D, \emptyset) \neq p_{\bot}(D, C, \emptyset)$. In fact, in the former case, observe that
$C$ must give up everything: the at-most restriction because it is in contrast with the at-least
restriction, and $\neg A$ inside universal quantification because it is in contrast with $\forall R.A$ in
$D$. Hence, penaltyPartial returns 2 = (1 from Step 5) + (1 from Step 1 of the recursive
call). Hence, $p_{\bot}(C, D, \emptyset) = 2$. In the latter case, instead, once the at-least restriction is
given up (and penaltyPartial adds 1 to $n$ in Step 4), since role fillers are no longer imposed,
the universal quantification is now compatible (the condition of the if in Step 6 is false).
Hence $p_{\bot}(D, C, \emptyset) = 1$.
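
The two values in this example can be replayed with a short Python sketch of penaltyPartial. As before, this is an illustration under assumptions and not the implementation described later: the `Concept` encoding and all function names are hypothetical, literal clashes are tested in both directions (so that a negated name in the first argument clashes with the plain name in the second), and `length` is an assumed reading of $|D|$ that counts names and number restrictions recursively.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """ALN concept in CNF; this encoding is an assumption of the sketch."""
    names: set = field(default_factory=set)       # literals "A" or "~A"
    at_least: dict = field(default_factory=dict)  # role -> x, for (>= x R)
    at_most: dict = field(default_factory=dict)   # role -> x, for (<= x R)
    forall: dict = field(default_factory=dict)    # role -> Concept, for (forall R.F)
    bottom: bool = False                          # True only for the concept bottom

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def length(d):
    """Assumed length |D|: names plus number restrictions, recursively under forall."""
    return (len(d.names) + len(d.at_least) + len(d.at_most)
            + sum(length(e) for e in d.forall.values()))

def penalty_partial(c, d):
    if c.bottom:                                            # Step 1
        return length(d)
    n = 0                                                   # Step 2
    n += sum(1 for lit in c.names if negate(lit) in d.names)   # Step 3
    for role, x in c.at_least.items():                      # Step 4
        if role in d.at_most and d.at_most[role] < x:
            n += 1
    for role, x in c.at_most.items():                       # Step 5
        if role in d.at_least and d.at_least[role] > x:
            n += 1
    for role, f in c.forall.items():                        # Step 6
        if role not in d.forall:
            continue
        x = c.at_least.get(role)
        # the at-least in C still counts only if Step 4 did not penalize it
        kept = x is not None and not (role in d.at_most and d.at_most[role] < x)
        if kept or d.at_least.get(role, 0) >= 1:
            n += penalty_partial(f, d.forall[role])
    return n                                                # Step 7

# C = (<= 1 R) and forall R.~A;  D = (>= 2 R) and forall R.A
c = Concept(at_most={"R": 1}, forall={"R": Concept(names={"~A"})})
d = Concept(at_least={"R": 2}, forall={"R": Concept(names={"A"})})
print(penalty_partial(c, d), penalty_partial(d, c))
```

Under these assumptions the sketch yields 2 for the first call and 1 for the second, matching the values derived in the proof.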

(ii) Syntax independence is an immediate consequence of the fact that Formula (3)
uses normal forms for concepts. Since normal forms are unique up to commutativity of
conjunction (which can be fixed by imposing some order on conjunctions, e.g., lexicographic),
the claim holds.

(iii) Antimonotonicity can be proved by induction on the QN of a generic concept $C'$
subsumed by $C$; we go through all conditions for subsumption, analyzing the changes in
the behavior of the algorithm from $C$ to $C'$. Recall that our goal is now to prove that
$p_{\bot}(C', D, \mathcal{T}) \geq p_{\bot}(C, D, \mathcal{T})$. In order to make a clear distinction between the two
computations, we let $n'$ be the (instance of the) variable used in the call to penaltyPartial(C', D),
while $n$ is used in the call to penaltyPartial(C, D). To ease notation, we assume that $C$, $C'$
are already in CNF.

• First of all, it could be the case that $C' = \bot$. In this case, $n' = |D|$ from Step 1 of
penaltyPartial. On the other hand, observe that penaltyPartial(C, D) $\leq |D|$ because
either $C = \bot$ too, or every increase in $n$ corresponds to an atomic concept in $D$ (by
inspection of Steps 3-5, and this recursively in Step 6). Therefore, the claim holds for
this base case.

• $C_{names} \subseteq C'_{names}$. For this case, it is obvious that Step 3 in penaltyPartial can only
make more increments to $n'$ w.r.t. $n$, since for $C'$ the number of iterations of the for
each increases.






• for every number restriction in $C_{\sharp}$, either the same number restriction appears in $C'_{\sharp}$,
or it is strengthened (an at-least increases, an at-most decreases) in $C'_{\sharp}$. Note that
strengthening a number restriction in $C'$ can only turn from false to true the condition
for the increment of $n$ in Steps 4-5. For instance, passing from $(\geq x\, R) \in C_{\sharp}$ to
$(\geq x'\, R) \in C'_{\sharp}$ with $x' \geq x$, if there is $(\leq y\, R) \in D_{\sharp}$ then $y < x$ implies $y < x'$. A
similar argument holds for the at-most. Moreover, number restrictions that appear
only in $C'_{\sharp}$ can only increase the number of iterations of Steps 4-5, hence $n'$ can only
increase w.r.t. $n$ and the claim holds.

The above three cases prove the basis of the induction ($C'$ with QN equal to 0). We now
prove the case for universal role quantification, assuming that the claim holds for QNs less
than $QN(C)$.

• for every $\forall R.F' \in C'_{all}$, either $R$ is not universally quantified in $C_{all}$, or there is
$\forall R.F \in C_{all}$ such that $F'$ is subsumed by $F$ (with $F' = F$ as a special case of subsumption).
Roles which are not universally quantified in $C_{all}$ but are quantified in $C'_{all}$
can only increase the number of iterations of Step 6, hence $n'$ can only increase due to
their presence. For roles that have a more specific restriction $F'$, the inductive hypothesis
is assumed to hold, since $QN(F') < QN(C)$. Hence $p_{\bot}(F', E, \mathcal{T}) \geq p_{\bot}(F, E, \mathcal{T})$.
This is equivalent to penaltyPartial(F', E) $\geq$ penaltyPartial(F, E). Moreover, if the
condition in Step 6 is true in the call penaltyPartial(C, D), then it is also true in
penaltyPartial(C', D), since $\forall R.F' \in C'_{all}$ and $(\geq x'\, R) \in C'_{\sharp}$; hence if the recursive
call penaltyPartial(F, E) is issued, then also penaltyPartial(F', E) is issued, increasing
$n'$ at least as much as $n$ is increased, by inductive hypothesis. Hence the claim holds
also in the inductive case.



7. The Matchmaking System 

The DL-based approach to semantic matchmaking illustrated in previous Sections has been
implemented in the $\mathcal{ALN}$ reasoning engine MaMaS (MatchMaking Service). It features all
classical inference services of a DL reasoner, but also implements algorithms for the
non-standard services for matchmaking presented in previous Sections.

MaMaS is a multi-user, multi-ontology Java servlet based system; it is available as an
HTTP service at http://dee227.poliba.it:8080/MAMAS-tng/DIG, and exposes a DIG
1.1 compliant interface (see footnote 8). The basic DIG 1.1 has been extended to cope with non-standard
services, and we briefly describe such additions here.

Footnote 8. DIG 1.1 is the new standardized DL systems interface developed by the Description Logic Implementation
Group (DIG) (Haarslev & Möller, 2003).

New elements:

• Match type detection: <matchType>E1 E2</matchType> computes the match type
according to the following classification: Exact (equivalence), Full, Plug-in, Potential,
Partial.

• Concept Abduction: <abduce>E1 E2</abduce> implements findIrred.

• Concept Contraction: <contract>E1 E2</contract> implements findContract.

• Ranking Score: <rank type="potential">E1 E2</rank> and
<rank type="partial">E1 E2</rank> compute $p_{\sqsubseteq}(C, D, \mathcal{T})$ and $p_{\bot}(C, D, \mathcal{T})$, respectively, as
presented in previous Sections.

New attributes for <newKB/> 

• shared: the only values to be used are true and false. In MaMaS, when a new 
knowledge base is created, each KB uri is associated with the IP address of the client 
host (owner) instantiating the KB. If the shared attribute is set to false, only the 
owner is authorized to submit tells statements and change the KB as well as to submit 
asks. In this case, requests from IP addresses different from the owner's one can be 
only asks. If the shared attribute is set to true, then no restriction is set on both 
tells and asks statements. True is the default value. 

• permanent: the only values to be used are true and false. In MaMaS, if a KB is 
not used for more than 300 seconds, the KB is automatically released. If a user wants 
to maintain the KB indefinitely, the permanent attribute must be set to true; false 
is the default value. 

It should also be pointed out that MaMaS only supports simple TBoxes, that is, concept
axioms have a concept name on the left side (see footnote 9).

We have been using MaMaS as a matching engine in various applications, including
e-marketplaces (see e.g., Colucci, Di Noia, Di Sciascio, Donini, Ragone, & Rizzi, 2006;
Colucci et al., 2005) and semantic web services discovery (Ragone, Di Noia, Di Sciascio,
Donini, Colucci, & Colasuonno, 2007). We do not delve into the details of such applications here,
and refer the interested reader to the cited references.

7.1 Experimental Evaluation 

The hypothesis we seek to confirm in this Section is that our approach performs effectively
in a wide range of matchmaking scenarios, i.e., it is able to model commonsense human
behavior in analyzing and ranking available offers, given a request. Hence the experimental
framework relies on a comparison of system behavior versus the judgement of human users.
Furthermore, although our system allows the use of weights to increase the relevance of
concepts, the results reported below refer to the basic "unweighted" version of the system, to
avoid biasing the results through the introduction of weights.

We tested our approach on three scenarios: apartments rental, date/partner
finding, and skill management for recruiting agencies. Several ontology design methodologies
have been proposed (Jones, Bench-Capon, & Visser, 1998); we adopted the one proposed
by Noy and McGuinness (2001).

9. Notice that since MaMaS supports $\mathcal{ALN}$, only atomic negation can be expressed, and hence <disjoint/>
groups must contain only concepts specialized by an <impliesc> axiom (sub-concept axiom). Defined
concepts <equalc/> (same-class) are not admitted in a disjoint group.

For all three scenarios we carried out a thorough domain analysis, starting with a large
set of advertisements taken from newspapers or from descriptions of on-line agencies, and
designed ontologies describing the domain. In particular:

• Apartments rental ontology is made up of 146 concepts (primitive + defined) and 33
roles.

• Date/partner matching ontology is made up of 131 concepts (primitive + defined)
and 29 roles.

• Skill matching ontology is made up of 308 concepts (primitive + defined) and 38 roles.

For each scenario we selected several announcements. The total number used in the
experiments with human users is 180 (120 offers, 60 requests) for the apartments rental and 215
(140 offers, 75 requests) for the skill matching. 100 advertisements for the Date matching
scenario were also selected, yet for these we did not actually distinguish between requests
and offers, as announcements were in the form of profiles, although they included preferences
for a dating partner. All announcements were in natural language and they were manually
translated into DL syntax. We then created, for each domain, 50 sets of questionnaires.
Questionnaires were in the form of one request (a demand or a supply) and 10 offering
advertisements. Three groups of ten randomly selected volunteers were then asked to order
the advertisements, according to their judgement, with respect to the given requests. Having
obtained average user rankings, we ran the same sets of advertisements with our system,
which gave us a set of system-provided rankings. System rankings that included partially
matching advertisements were simply ordered below the worst potentially matching
advertisement. We adopted, as reference, a standard Vector Space Model (VSM) (Salton & Gill,
1983) system. We used the terms in our ontologies, "flattening" the ontology descriptions, as
dimensions of three separate vector spaces, and determined weights using the classical TF * IDF
measure. Similarity results were computed using the well-known cosine similarity measure
(Salton & Gill, 1983).
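
For readers unfamiliar with the baseline, the following Python sketch shows a generic TF * IDF weighting followed by cosine similarity. It is a textbook variant chosen for illustration, not necessarily the exact weighting used in the experiments: the idf formula `log(N / df)`, the function names, and the three tiny term lists are all assumptions of this sketch.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each document (a list of terms) to a sparse {term: tf * idf} vector."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))   # document frequency
    return [{t: tf * math.log(n / df[t]) for t, tf in Counter(doc).items()}
            for doc in docs]

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical flattened descriptions: one request followed by two offers.
docs = [["apartment", "balcony", "garden"],
        ["apartment", "balcony"],
        ["apartment", "garage"]]
request, offer1, offer2 = tfidf_vectors(docs)
print(cosine(request, offer1), cosine(request, offer2))
```

Note that a term occurring in every description (here `apartment`) receives weight zero, so such a model ranks offers purely on shared discriminating terms and ignores the logical structure of the descriptions.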

To summarize results we adopted $R_{norm}$ (Bollmann, Jochum, Reiner, Weissmann,
& Zuse, 1985) as quality measure of our system effectiveness. $R_{norm}$ is defined as follows.
Given $Sup$, a finite set of descriptions with a user-defined preference relation $\geq$ that is
complete and transitive, let $\Delta_{usr}$ be the rank ordering of $Sup$ induced by the users' preference
relation, and let $\Delta_{sys}$ be the system-provided ranking. $R_{norm}$ is then defined as:

$$R_{norm}(\Delta_{sys}) = \frac{1}{2} \cdot \left(1 + \frac{S^{+} - S^{-}}{S^{+}_{max}}\right)$$

where $S^{+}$ is the number of description pairs where a better description is ranked by the
system ahead of a worse one; $S^{-}$ is the number of pairs where a worse description is ranked
ahead of a better one; and $S^{+}_{max}$ is the maximum possible value of $S^{+}$. It should be noticed
that the calculation of $S^{+}$, $S^{-}$, and $S^{+}_{max}$ is based on the ranking of description pairs in
$\Delta_{sys}$ relative to the ranking of corresponding description pairs in $\Delta_{usr}$. $R_{norm}$ values are
in the range [0,1]; a value of 1 corresponds to a system-provided ordering of the available
descriptions that is either identical to the one provided by the human users or has a higher
degree of resolution; lower values correspond to a proportional disagreement between the
two. For the three scenarios considered, results are presented in Table 3.

Domain                   MaMaS   VSM
Apartments rental        0.87    0.48
Date/partner matching    0.79    0.41
Skill matching           0.91    0.46

Table 3: $R_{norm}$ values. MaMaS: semantic matchmaking results; VSM: Vector Space Model
results.
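
The $R_{norm}$ measure used in Table 3 is easy to compute from two rankings. The sketch below is a minimal illustration, assuming each ranking is given as a hypothetical dictionary from description identifiers to rank positions (lower is better, equal values denote ties); pairs on which the users express no preference are skipped, and $S^{+}_{max}$ is taken to be the number of remaining pairs.

```python
from itertools import combinations

def r_norm(user_rank, sys_rank):
    """R_norm of a system ranking w.r.t. a user ranking (lower rank = better)."""
    s_plus = s_minus = s_max = 0
    for a, b in combinations(user_rank, 2):
        u = user_rank[a] - user_rank[b]
        if u == 0:
            continue                  # users are indifferent: pair is not counted
        s_max += 1
        s = sys_rank[a] - sys_rank[b]
        if s * u > 0:
            s_plus += 1               # system orders the pair as the users do
        elif s * u < 0:
            s_minus += 1              # system inverts the users' order
    if s_max == 0:
        raise ValueError("the user ranking expresses no preference at all")
    return 0.5 * (1 + (s_plus - s_minus) / s_max)

users = {"a": 1, "b": 2, "c": 3}
print(r_norm(users, {"a": 1, "b": 2, "c": 3}))   # identical ordering
print(r_norm(users, {"a": 3, "b": 2, "c": 1}))   # fully reversed ordering
print(r_norm(users, {"a": 1, "b": 3, "c": 2}))   # one inverted pair out of three
```

A system that ties all descriptions scores 0.5 under this definition, since it neither agrees nor disagrees with any user preference.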



Although the values in Table 3 vary across domains, which we believe is partly due to how
well the ontology design captures each domain, they show that our approach provides rankings
that are close to human commonsense behavior and are far better than those obtained with
unstructured text retrieval tools.

8. Conclusion 

We have addressed the matchmaking problem between descriptions from a DL perspective.
We have analyzed the semantic-based matchmaking process and devised general commonsense
properties a matchmaker should have. We have also pointed out that classical inference
services of DLs, such as satisfiability and subsumption, are needed and useful, but may
not be sufficient to cope with the challenges posed by matchmaking in an open environment.

Motivated by this, we have studied Concept Abduction and Contraction as novel non-monotonic
inferences in DLs suitable for modeling semantic-based matchmaking scenarios.
We analyzed minimality criteria, and proved simple complexity results. We also presented
reasonable algorithms for classifying and ranking matches based on the devised inferences
in terms of penalty functions, and proved that they obey the properties identified.

Although several other measures may be determined to compute a score for "most
promising" matches, our proposal has logical foundations and we have empirically shown that it
simulates commonsense human reasoning well. Obviously, like any other semantic-based
approach, ours has to rely on well-designed ontologies able to model the
application domain being considered.

Based on the theoretical work, we have implemented a fully functional matchmaking
facilitator, oriented to both generic e-marketplace advertisements and semantic-based
web-service discovery, which exploits state-of-the-art technologies and protocols and is, to
the best of our knowledge, the only running system able to cope with Concept Abduction
and Concept Contraction problems.

With specific reference to earlier work of the authors on the subject, Di Sciascio et al.
(2001) defined matchmaking as satisfiability of concept conjunction. Definitions of potential
match and near-miss, i.e., partial match, in terms of abduction and belief revision were outlined,
and the need for ranking of matches motivated, in the work of Di Sciascio, Donini, and
Mongiello (2002). Di Noia et al. (2003b, 2003c) proposed a semantic-based categorization of
matches, logic-based ranking of matches within categories, and properties ranking functions
should have, in the framework of e-marketplaces. An extended and revised version of such
works is in (Di Noia, Di Sciascio, Donini, & Mongiello, 2004). Di Noia et al. (2003a) introduced
Concept Abduction in DLs and presented algorithms to solve a Concept Abduction
Problem in $\mathcal{ALN}$. Colucci et al. (2003) proposed both Concept Abduction and Concept
Contraction as inferences suitable for semantic matchmaking and explanation services. Calì
et al. (2004) proposed a basic approach adopting penalty functions ranking, in the framework
of dating systems. Colucci et al. (2004) proposed initial results and algorithms based
on truth-prefixed tableau to solve Concept Abduction and Contraction problems in $\mathcal{ALN}$.
Colucci et al. (2005) showed that such services can be usefully adopted both for semantic
matchmaking and for finding negotiation spaces in an e-commerce setting. The use of the
proposed inference services for refinement purposes in the semantic matchmaking process
has been outlined in the work of Colucci et al. (2006).

Our current research is oriented to the investigation of algorithms for more expressive 
DLs and the development of a tableaux-based system for the proposed inference services. 

Appendix A. Rules for Normal Form 

The normal form of a concept can be obtained by repeatedly applying the rules of the two 
following sections, until no rule is applicable at any level of nesting of concepts inside 
$\forall R.C$. 

A.1 Rules Involving Subconcepts 

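In the following rules, the $\sqcap$ symbol on the l.h.s. should be considered as an 
associative and commutative operator; hence, for instance, when writing 
$(\geq n\, R) \sqcap (\leq m\, R)$ in the second rule, this should be read as: the concepts 
$(\geq n\, R)$ and $(\leq m\, R)$ appear in any order inside a conjunction of two or more 
concepts. 

\[
\begin{array}{rcll}
C \sqcap \bot & \rightarrow & \bot & \\
(\geq n\, R) \sqcap (\leq m\, R) & \rightarrow & \bot & \text{if } n > m \\
A \sqcap \neg A & \rightarrow & \bot & \\
(\geq n\, R) \sqcap (\geq m\, R) & \rightarrow & (\geq n\, R) & \text{if } n > m \\
(\leq n\, R) \sqcap (\leq m\, R) & \rightarrow & (\leq n\, R) & \text{if } n < m \\
\forall R.D_1 \sqcap \forall R.D_2 & \rightarrow & \forall R.(D_1 \sqcap D_2) & \\
\forall R.\bot & \rightarrow & \forall R.\bot \sqcap (\leq 0\, R) &
\end{array}
\]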
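
As an illustration only, the rules of this section can be sketched in a few lines of Python. The encoding is an assumption made for the example and is not taken from the implemented system: a conjunction is a set of tuples, with the hypothetical tags "bot", "atom", "not", "ge", "le" and "all", and the filler of a universal restriction is itself a set of conjuncts. The function name `normalize` is likewise hypothetical. Instead of rewriting step by step, the sketch groups conjuncts by role in a single pass, which is intended to reach the same fixpoint as applying the rules exhaustively.

```python
BOT = ("bot",)
BOTTOM = frozenset({BOT})  # the normal form of every unsatisfiable conjunction


def normalize(conjuncts):
    """Return the normal form of a conjunction, given as an iterable of conjuncts.

    Hypothetical encoding: ("bot",), ("atom", A), ("not", A),
    ("ge", n, R), ("le", n, R), ("all", R, frozenset_of_conjuncts).
    """
    atoms, negated = set(), set()
    at_least, at_most, fillers = {}, {}, {}
    for c in conjuncts:
        kind = c[0]
        if kind == "bot":
            return BOTTOM  # rule: C AND bottom -> bottom
        if kind == "atom":
            atoms.add(c[1])
        elif kind == "not":
            negated.add(c[1])
        elif kind == "ge":
            # rule: (>= n R) AND (>= m R) -> keep the larger bound
            at_least[c[2]] = max(c[1], at_least.get(c[2], c[1]))
        elif kind == "le":
            # rule: (<= n R) AND (<= m R) -> keep the smaller bound
            at_most[c[2]] = min(c[1], at_most.get(c[2], c[1]))
        elif kind == "all":
            # rule: ALL R.D1 AND ALL R.D2 -> ALL R.(D1 AND D2)
            fillers.setdefault(c[1], set()).update(c[2])
        else:
            raise ValueError(f"unknown conjunct: {c!r}")
    if atoms & negated:
        return BOTTOM  # rule: A AND NOT A -> bottom
    universals = set()
    for role, parts in fillers.items():
        filler_nf = normalize(parts)  # rules apply at every nesting level
        if filler_nf == BOTTOM:
            at_most[role] = 0  # rule: ALL R.bottom -> ALL R.bottom AND (<= 0 R)
        universals.add(("all", role, filler_nf))
    for role, n in at_least.items():
        if role in at_most and n > at_most[role]:
            return BOTTOM  # rule: (>= n R) AND (<= m R) -> bottom if n > m
    result = {("atom", a) for a in atoms} | {("not", a) for a in negated}
    result |= {("ge", n, r) for r, n in at_least.items()}
    result |= {("le", n, r) for r, n in at_most.items()}
    return frozenset(result | universals)
```

For instance, under this encoding `normalize({("ge", 3, "R"), ("le", 2, "R")})` yields `BOTTOM`, since the two number restrictions on R clash.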

A.2 Rules Involving the Concept and the TBox 

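\[
\begin{array}{rcll}
A & \rightarrow & A \sqcap C & \text{if } A \sqsubseteq C \in \mathcal{T} \\
A & \rightarrow & C & \text{if } A \equiv C \in \mathcal{T} \\
A & \rightarrow & A \sqcap \neg B_1 \sqcap \cdots \sqcap \neg B_k & \text{if } \mathit{disj}(A, B_1, \ldots, B_k) \in \mathcal{T}
\end{array}
\]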

Usually the concept resulting from the application of the above rules is referred to as an 
expansion, or unfolding of a TBox. 
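
A minimal sketch of such an unfolding is given below, under simplifying assumptions that are not part of the formal treatment: the TBox is acyclic, the right-hand side of every axiom is a conjunction of concept names only, and a negated name is written as the string "not B". The function `unfold` and the three containers `includes`, `defines` and `disjoint` are hypothetical names chosen for the example.

```python
def unfold(atom, includes, defines, disjoint):
    """Expand a concept name into a set of conjuncts w.r.t. an acyclic TBox.

    includes: dict name -> list of names, one entry per inclusion axiom
    defines:  dict name -> list of names, one entry per definition axiom
    disjoint: list of tuples of names, one tuple per disjointness axiom
    """
    expanded = set()
    if atom in defines:
        # rule: A -> C if A is defined as C (the name A is replaced)
        parents = defines[atom]
    else:
        # rule: A -> A AND C if A is included in C (the name A is kept)
        expanded.add(atom)
        parents = includes.get(atom, ())
    for parent in parents:
        # unfold recursively; termination relies on the TBox being acyclic
        expanded |= unfold(parent, includes, defines, disjoint)
    for group in disjoint:
        if atom in group:
            # rule: A -> A AND NOT B1 AND ... AND NOT Bk
            expanded |= {"not " + other for other in group if other != atom}
    return expanded


# Toy TBox, invented for illustration only.
includes = {"Laptop": ["Computer"], "Computer": ["Device"]}
defines = {"Notebook": ["Laptop", "Light"]}
disjoint = [("Laptop", "Desktop")]
notebook = unfold("Notebook", includes, defines, disjoint)
```

With this toy TBox, `notebook` contains the names Laptop, Computer, Device and Light together with the negated name "not Desktop". The resulting set of conjuncts can then be handed to the rules of Section A.1.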

A.3 Properties of the Normal Form 

Let $C$ be a concept in Classic, and let $C'$ be any concept obtained from $C$ by repeatedly 
applying the above rules. Let $|C|$, $|C'|$ denote the size of $C$, $C'$ respectively. It can 
be proved (Borgida & Patel-Schneider, 1994) that: 

1. if $|C'|$ is polynomially bounded in $|C|$, then $C'$ can be computed in time $O(|C|^2)$; 

2. every concept resulting from the application of the rules is equivalent to $C$, w.r.t. 
models of the TBox. 

As a consequence of the latter property, $C$ is unsatisfiable iff its normal form is $\bot$. 
Then, as a consequence of the former property, unsatisfiability can be decided in polynomial 
time (Borgida & Patel-Schneider, 1994). The fact that $|C'|$ is polynomially bounded in $|C|$ 
has been intuitively related by Nebel (1990) to the form of TBoxes, that should be "bushy but 
not deep". A more precise definition has been given by Colucci et al. (2004). 